Technology and Financial Reporting

Started by Waltzing, Jun 13, 2023, 10:53 AM


Ferg

#15
Quote from: Waltzing on Jun 16, 2023, 10:35 PM
The only Developer to do a Bubble sort in VBA outside PASCAL!!!!

Classic !!!

;D  Classic indeed. Pascal was my second language; COBOL was my first. From memory, Pascal forced indentation (Polish notation?), and I still use it.

Waltzing

#16
I'd have to get my Pascal book out to look.

https://en.wikipedia.org/wiki/Polish_notation
 
Once you have the tables in Excel, an extraction to XML tables might be possible.

Waltzing

#17
OK Ferg, I'm thinking about bringing the imported bookmark table in Word into an object with a sortable list.

QDocumentBookmarks    queue,type
BookMarkNo              long,name('BookMarkNo')
BookMark                string(255),name('BookMark')
RowColumnTableRef       &RowColumnTable
                      End

DocumentBookmarksRef.RowColumnTableRef &= New(RowColumnTable)

RowColumnTable is a class that handles the data in the table.

This will hold the bookmarks and the table contents for processing into a larger XML database solution...

Now I can't wait to try this sometime in the next few weeks, if we get the time...

No bubble sort required.

The queue's lists can be sorted on demand with the built-in SORT command.

This could also be used for statements, like dividend statements in PDF etc.
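For anyone without Clarion to hand, here is a rough Python sketch of the same idea - a queue of bookmark entries, each carrying its table data, sorted on demand. The field names mirror the queue declaration above; the Python structure itself is my illustration, not the original code:

```python
from dataclasses import dataclass, field

@dataclass
class RowColumnTable:
    """Stands in for the class that handles a table's row/column data."""
    rows: list = field(default_factory=list)

@dataclass
class DocumentBookmark:
    bookmark_no: int
    bookmark: str                     # bookmark name, up to 255 chars
    row_column_table: RowColumnTable  # table data attached to this bookmark

# The "queue": a plain list of bookmark entries.
bookmarks = [
    DocumentBookmark(2, "BalanceSheet", RowColumnTable()),
    DocumentBookmark(1, "ProfitAndLoss", RowColumnTable()),
]

# Clarion's on-demand SORT: re-order by any field whenever needed.
bookmarks.sort(key=lambda b: b.bookmark_no)
print([b.bookmark for b in bookmarks])  # → ['ProfitAndLoss', 'BalanceSheet']
```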

Waltzing

#18
HLG PDF reports are locked up to stop conversion to Word.

Unlocking can be done by printing the PDF to PDF.

Not sure if the bookmarks stay intact.


Waltzing

OK, printing the PDF was manual, but I think it can be automated using the standard OLE ActiveX late-binding commands sent as text commands. Will demo later next week, I hope..

Got full table and paragraph XML data tables working..

Print to PDF comes up on Windows 10 and 11 as a standard option under Print in Office.

Waltzing

We think we have fixed the problems using the Pro version of Adobe. We created an index-template approach to the dividend PDFs: the company has a paragraph template, and when PDFs are imported and processed (converted to Word using the Pro version), the positions of the paragraphs are mapped to the templates.

And you have a template-learning processing solution for dividend statements.

Now to apply that to learning public company accounts.

WZ

Waltzing

Windows 11 in Edge, showing the default PDF option in the printer options.


Ferg

Gotcha - it was the short-form version of the HLG A/R that was locked up tight and won't print. The glossy version has the print option, so that's all good. It's going to be a long winter - these are complex underlying data structures. Plus look at the partial lack of tables in FRW - half the data is in a table, the descriptions are not....

Waltzing

#23
Yes, I noticed this on dividend statements, and that is after processing using Adobe Pro... maybe they will allow AI? Ahh, I got it...

Anyway, we now have a structure where you program the document you're processing with some commands for documents...

We are going to host it in our SMART VARIANT language, which means you can develop it at runtime...

Which means you can train it with a GUI that you create... the commands sit in a simple text file and you don't need to write VBA...

Made a lot of progress at the weekend in Auckland.

The trick is to manage the data extracted from the PDFs and allow it to be easily manipulated for statistics and for business-related matters.

For this we have stored the data from tables and paragraphs in what is called a sortable list or queue, which can be automatically stored as XML data tables..

Hence we have data stored for processing in loops and spreadsheets from XML data tables. Simple management of the extracted data.
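As a sketch of what a queue stored as an XML data table can look like in practice - the element and field names here are my own illustration, not the actual schema:

```python
import xml.etree.ElementTree as ET

# A sortable list (queue) of rows extracted from a document table.
rows = [
    {"Entity": "CompanyA", "YTD": "1200.50"},
    {"Entity": "CompanyB", "YTD": "980.00"},
]

# Store the queue as an XML data table...
table = ET.Element("DataTable", name="DividendSummary")
for row in rows:
    row_el = ET.SubElement(table, "Row")
    for field_name, value in row.items():
        ET.SubElement(row_el, field_name).text = value

xml_text = ET.tostring(table, encoding="unicode")

# ...and read it back for processing in loops or spreadsheets.
reloaded = [
    {child.tag: child.text for child in row_el}
    for row_el in ET.fromstring(xml_text)
]
print(reloaded)
```

Once the data is tagged like this, any downstream tool that reads XML can pick it up.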
 

Waltzing

I forgot to say the real trick here is the consolidation of multiple entities for reporting. For that we can add a project for processing multiple entities onto one page for P&L and balance sheet reporting...

It currently handles only 10 entities on one page and has 3 balances per report column set, but we could take that set of classes and set the data table as a single reporting column instead of 3.

Instead of YTD, OB and QTY, we will see if we can supply a sortable list with one column for YTD.

Now imagine that kind of power in your hands for reporting on multiple public companies on one page...

It's based on a project we have that lets us report on multiple private entities on one page..

Power over managing entity data is the power we like the most...
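A minimal sketch of that consolidation step - several entities, one YTD column each, rolled up onto one page. The entity names, account names and figures are invented for illustration; the original is a set of classes, not this code:

```python
# One YTD figure per entity per account.
entities = {
    "EntityA": {"Revenue": 500.0, "Expenses": -300.0},
    "EntityB": {"Revenue": 220.0, "Expenses": -150.0},
    "EntityC": {"Revenue": 130.0, "Expenses": -40.0},
}

accounts = ["Revenue", "Expenses"]

# One row per account; one column per entity, plus a consolidated total.
page = []
for account in accounts:
    row = {"Account": account}
    for name, figures in entities.items():
        row[name] = figures.get(account, 0.0)
    row["Consolidated"] = sum(e.get(account, 0.0) for e in entities.values())
    page.append(row)

for row in page:
    print(row)
```

The same loop would work unchanged for public company accounts, provided the per-entity figures land in a common account structure first.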

Waltzing

C++ large language models... and the possibility of running them locally via your scripting language.

Once you have this kind of power turning up on your local machines, you start to look at big data processing locally.

It doesn't take an Einstein to realise that PhD students are going to be all over this, and the divide between the haves and have-nots is going to be a whole new society...

https://blog.maartenballiauw.be/post/2023/06/15/running-large-language-models-locally-your-own-chatgpt-like-ai-in-csharp.html

The fat lady hasn't sung yet, but she's warming up..


Waltzing

#26
Processing the converted PDF documents into their paragraphs and tables is simple once you have the data and it's separated from any strange characters that might contaminate your data...

Then it's simply a matter of using some macro instring() calls to find the phrases you're after, and tables for transfer into data structures that might house the information you want to manage....

The instring macros can take phrases, and you can instruct your software to learn from an importation GUI...

Paragraphs have their positions in the VBA object model, and they are the same for many documents that are standard formats...

Your learning template needs some fancy commands of course, but this scraping and matching is the next step in setting up micro learning tables. With the embedding of large language models in scripting languages, the future is about connecting your information to these models and storing the information for future use...

Here in this example, data is stored in plain XML for further processing and matching. Once data is XML-tagged, you can further pass these data sets on to any software you want.
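A small Python sketch of the instring-style phrase matching described above - Clarion's INSTRING roughly corresponds to Python's str.find. The phrases and paragraph text here are invented examples, not the actual learned template:

```python
# Paragraphs as they might come out of a converted document.
paragraphs = [
    "Dividend per share: 12.5 cents",
    "Record date: 30 June 2023",
    "Chairman's address to shareholders",
]

# Phrases the template has "learned", and the field each one feeds.
learned_phrases = {
    "Dividend per share": "DividendPerShare",
    "Record date": "RecordDate",
}

# instring-style scan: find each phrase and capture the text after it.
extracted = {}
for text in paragraphs:
    for phrase, field_name in learned_phrases.items():
        pos = text.find(phrase)          # -1 if the phrase is absent
        if pos != -1:
            extracted[field_name] = text[pos + len(phrase):].strip(": ")

print(extracted)
```

An importation GUI would then just be a front end for adding rows to the learned-phrases table.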


Waltzing

#27
The new Meta Base Template system does not yet read documents and understand them until it is trained...

It takes a document apart and then matches its composition to its private and public understanding of a document template. It can match the parts of a document to its PP contexts (public and private word-base EQU tables; EQU stands for the word-information equates). These are then matched back to the processed document's templates, which can process and execute a document's parts - sort of turning the parts of a document into an interactive, command-driven analysis.

This process then decides when and if to link to an LLM for further processing. This reduces the processing overhead of using LLMs, which is cheaper for the users and the environment, as LLMs are going to be anything but green!

You compose the template from parts of a document type.

And then, when linked up to a large language scripting model, it might in a few years' time do a lot more, such as read company reports, creating a series of data tables before processing those tables and sending parts of them to LLMs for further processing.

Right now most people use LLMs by throwing whole documents at the LLM.

The Meta Base Template system pre-processes documents before selectively sending information to an LLM for further analysis.

The benefit of this approach is that large meta models can be built to support LLMs, and statistical data can be stored locally.
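A sketch of that pre-process-then-selectively-send idea in Python. The routing rule and all the names here are my own hypothetical illustration of the approach, not the actual Meta Base Template system:

```python
# Take a document apart into parts, match each part against the template's
# known phrases (the "EQU tables"), and only queue unmatched parts for the
# expensive LLM call.
document_parts = [
    "Dividend declared: 12.5 cents per share",
    "The board discussed long-term strategy in general terms.",
]

# Template knowledge: phrases the local system already understands.
known_phrases = ["Dividend declared"]

handled_locally, for_llm = [], []
for part in document_parts:
    if any(phrase in part for phrase in known_phrases):
        handled_locally.append(part)   # extracted by the template, no LLM cost
    else:
        for_llm.append(part)           # needs the LLM's broader understanding

print(len(handled_locally), len(for_llm))  # → 1 1
```

The point of the design is the ratio: the more the template learns, the less ends up in for_llm, and the cheaper each document gets.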


Waltzing

There is no doubt that, from an investor's perspective, local and global server language models are the "way of the future".

https://www.youtube.com/watch?v=4_Pbx9mvWPY

Amazon has been working away on this for a while, and Microsoft does not have its own chips but must rely on others. Not that that is a bad thing.

https://www.cnbc.com/2023/08/12/amazon-is-racing-to-catch-up-in-generative-ai-with-custom-aws-chips.html


Waltzing

In regard to financial accounting standards, it's time these were thrown out along with the credibility of the accounting society.

Multi-dimensional data constructs have been available on computer hardware and software stacks for 30 years or more, and yet accounting remains single-dimensional and therefore completely polluted and near useless..

It's time for developers to take the matter in hand and just get rid of the baby and the bath water....

Bye bye single-dimensional accounting - and gen AI won't need it either....