budget:start

Differences

This shows you the differences between two versions of the page.

budget:start [2012/10/12 07:32]
127.0.0.1 external edit
budget:start [2013/03/27 16:25]
andy [Classifying Transactions]
Line 2:
  
 A monthly budgeting application, to plan a budget and stick to it.
 +
 +===== Budget Analyser =====
 +
 +The planning itself can easily be done in a spreadsheet for now, but analysing actual spending against the budget that way would be time-consuming. The most beneficial application to write is therefore a simple budget tracker, which uses something like a naive Bayes classifier to assign a category to each transaction and then produces monthly totals per category.
 +
 +==== Classifying Transactions ====
 +
 +A transaction can be expected to contain at least the following information:
 +
 +  * A date.
 +  * An amount of money, which may be positive (for credits) or negative (for debits).
 +  * An identifying string or recipient.
 +
 +Transactions may optionally include:
 +
 +  * An updated balance.
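
As a rough illustration, the fields listed above could be held in a small record type. This is only a sketch: the class name ''Transaction'' and its field names are assumptions made for this example, not part of any existing code.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal
from typing import Optional


@dataclass(frozen=True)
class Transaction:
    """One statement line. All names here are illustrative assumptions."""
    when: date                         # the transaction date
    amount: Decimal                    # positive for credits, negative for debits
    description: str                   # identifying string or recipient
    balance: Optional[Decimal] = None  # updated balance, if the statement gives one

    @property
    def is_credit(self) -> bool:
        # Credits are positive amounts, debits are negative.
        return self.amount > 0


# The example transaction used in the tokenising notes on this page.
tx = Transaction(date(2013, 3, 27), Decimal("-38.15"), "BRGAS-ELEC AC110298738")
```

''Decimal'' is used here rather than ''float'' so that monthly totals add up exactly; that is a design choice of this sketch.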
 +
 +All of these items can be used to classify the transaction, but first they must be transformed into tokens. Each item is tokenised separately, as follows:
 +
 +^ Date | The month, the day of the month, the day of the week and the nth occurrence of that weekday within the month are each converted into a token. For example, **2013-03-27** (the fourth Wednesday of March) would yield tokens **''mon-mar''**, **''mday-27''**, **''wday-wed''** and **''nthday-4''**. |
 +^ Amount | Amounts are bucketed on a logarithmic scale, using base 2 for simplicity. To avoid weak indicators for amounts that fall near a power-of-2 boundary, the next power up is also included as a token. For example, the amount **£38.15** lies between 2^5 = 32 and 2^6 = 64, so it would yield tokens **''amnt-2^5''** and **''amnt-2^6''**. |
 +^ Description | The description is forced to lowercase and split into alphanumeric tokens, treating any other character as a separator. For example, the string **''BRGAS-ELEC AC110298738''** would yield tokens **''brgas''**, **''elec''** and **''ac110298738''**. |
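
The three tokenising rules in the table above might be sketched as follows. The function names are invented for this example. Two details are assumptions the table does not settle: the sign of the amount is ignored when choosing the bucket, and a zero amount gets its own ''amnt-0'' token. Only ASCII letters and digits are treated as alphanumeric.

```python
import math
import re
from datetime import date

MONTHS = ["jan", "feb", "mar", "apr", "may", "jun",
          "jul", "aug", "sep", "oct", "nov", "dec"]
WEEKDAYS = ["mon", "tue", "wed", "thu", "fri", "sat", "sun"]  # date.weekday(): Monday is 0


def date_tokens(d: date) -> list[str]:
    """Month, day of month, weekday and nth occurrence of that weekday."""
    nth = (d.day - 1) // 7 + 1  # days 1-7 are the 1st occurrence, 8-14 the 2nd, ...
    return [
        f"mon-{MONTHS[d.month - 1]}",
        f"mday-{d.day}",
        f"wday-{WEEKDAYS[d.weekday()]}",
        f"nthday-{nth}",
    ]


def amount_tokens(amount: float) -> list[str]:
    """Base-2 log bucket plus the next bucket up."""
    magnitude = abs(amount)  # assumption: debits and credits share buckets
    if magnitude == 0:
        return ["amnt-0"]    # assumption: log2(0) is undefined, so use a special token
    exponent = math.floor(math.log2(magnitude))
    return [f"amnt-2^{exponent}", f"amnt-2^{exponent + 1}"]


def description_tokens(text: str) -> list[str]:
    """Lowercase, then split on anything that is not an ASCII letter or digit."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tokenise(d: date, amount: float, description: str) -> list[str]:
    """Concatenate all tokens for one transaction into a single list."""
    return date_tokens(d) + amount_tokens(amount) + description_tokens(description)


tokens = tokenise(date(2013, 3, 27), -38.15, "BRGAS-ELEC AC110298738")
```

With the example values from the table, ''tokens'' holds the four date tokens, then ''amnt-2^5'' and ''amnt-2^6'', then ''brgas'', ''elec'' and ''ac110298738''.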
 +
 +These tokens are concatenated into a single list, and the classifier treats them all equally. This lets the classifier learn for itself which tokens are reliable indicators of a particular category.
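
To show how a flat token list could drive the classification, here is a minimal multinomial naive Bayes sketch. The class name, the method names, the add-one (Laplace) smoothing and the training categories are all assumptions of this example; the notes above only say "something like" a naive Bayes classifier.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Minimal multinomial naive Bayes over token lists (illustrative sketch)."""

    def __init__(self):
        self.category_counts = Counter()          # training transactions per category
        self.token_counts = defaultdict(Counter)  # token frequencies per category
        self.vocabulary = set()                   # every token seen in training

    def train(self, tokens, category):
        self.category_counts[category] += 1
        self.token_counts[category].update(tokens)
        self.vocabulary.update(tokens)

    def classify(self, tokens):
        """Return the most probable category, or None if nothing has been trained."""
        total = sum(self.category_counts.values())
        best, best_score = None, -math.inf
        for category, n in self.category_counts.items():
            counts = self.token_counts[category]
            # Add-one smoothing so an unseen token never gives a zero probability.
            denominator = sum(counts.values()) + len(self.vocabulary)
            score = math.log(n / total)  # log prior
            for token in tokens:         # every token is weighted equally
                score += math.log((counts[token] + 1) / denominator)
            if score > best_score:
                best, best_score = category, score
        return best


nb = NaiveBayes()
nb.train(["mday-27", "amnt-2^5", "amnt-2^6", "brgas", "elec"], "utilities")
nb.train(["wday-sat", "amnt-2^4", "amnt-2^5", "tesco", "stores"], "groceries")
guess = nb.classify(["amnt-2^6", "brgas", "elec"])
```

Because date, amount and description tokens share one list, nothing here says which kind of token matters; the per-category counts decide that, which is the point of treating them equally.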
 +
 +
 +===== Attic =====
 +
 +The previous contents of this page, for posterity.
  
 FIXME: //Delete this// \\
budget/start.txt · Last modified: 2013/03/27 16:25 by andy