Monday, March 5, 2012

Target application: receipts


I have this goal to record each and every expense in the household and categorize it for budgeting. Unfortunately, for the past four years I've failed to meet that goal. The problem is that it's so difficult to keep up with entering the paper receipts: it involves a great deal of context switching between paper and screen to find where the date, amount, and destination of each expense are.

So I just don't do it. Instead, I pile up receipts in small boxes scattered around my office.

But now I have this nifty little photo scanner, the PanDigital Photolink. It's great for small stuff; it just sucks things through and stores the scan file onto an SD card for your viewing pleasure, at about 300 dpi. Scanning the receipts is easy because there are no context switches (at the very least, this will make it possible to free my desk of the many small boxes). Once they're scanned, I want to do the following:
  • Delete mis-scans (if the receipt doesn't quite engage, sometimes there's a little blob that isn't actually anything). This I can do manually after each scanning session.
  • Shrink the files - I don't actually need 300 dpi quality for these, and at about 400 kB a pop, my 80's self is offended by the size of the data.
  • Merge any two-scan receipts - the scanner gives up after about eight inches, knowing it's not actually a plausible length and assuming your photo has jammed. For long receipts like grocery shopping at Meijer's, I'll scan receipts in two sections. Using physical scissors. Then I want to group them as a single receipt.
  • Ideally, straighten the scan up. The receipts are too narrow for the scanner to detect them if they're against the guide rail of the bed, so I scan down the middle of the bed - the result is that they're all slightly slanted. Some move a little during the scan, so they're also bent. Not much to do about that.
  • Ideally, OCR them.
  • Using a combination of OCR and a viewer application (this would be a simple GUI with a viewer for the graphic and a record entry for the data), verify any OCR'd data or enter the data if OCR can't get it.
  • Index everything into a SQLite database, along with non-receipt expenses such as checks or online payments. Categorize and report using something analogous to the Access database I built in the 90's.
That's pretty simple. It should be nearly as simple to write this in Decl as it was to explain it just now.
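
To make the last two steps a bit more concrete, here is a rough sketch of what the index might look like. It's in Python with the standard sqlite3 module rather than Decl, and every table, column, and function name in it (expense, scan, add_expense, totals_by_category) is made up purely for illustration; nothing here is a settled design. The one idea it tries to show is that a receipt scanned in two sections is simply one expense with two scan rows, and that checks or online payments are expenses with no scans at all.

```python
import sqlite3

# Hypothetical schema: one row per expense, zero or more scan files per expense.
SCHEMA = """
CREATE TABLE IF NOT EXISTS expense (
    id       INTEGER PRIMARY KEY,
    date     TEXT NOT NULL,      -- ISO 8601 date string
    payee    TEXT NOT NULL,      -- where the money went
    cents    INTEGER NOT NULL,   -- amount in cents, to avoid float rounding
    category TEXT,               -- budgeting category, filled in on review
    source   TEXT NOT NULL       -- e.g. 'receipt', 'check', 'online'
);
CREATE TABLE IF NOT EXISTS scan (
    id         INTEGER PRIMARY KEY,
    expense_id INTEGER NOT NULL REFERENCES expense(id),
    part       INTEGER NOT NULL, -- 1, 2, ... for receipts scanned in sections
    path       TEXT NOT NULL,    -- location of the (shrunken) scan file
    UNIQUE (expense_id, part)
);
"""


def open_db(path=":memory:"):
    """Open (or create) the database and make sure the tables exist."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn


def add_expense(conn, date, payee, cents, category, source, scan_paths=()):
    """Insert one expense and attach its scan files, in scan order."""
    cur = conn.execute(
        "INSERT INTO expense (date, payee, cents, category, source) "
        "VALUES (?, ?, ?, ?, ?)",
        (date, payee, cents, category, source),
    )
    expense_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO scan (expense_id, part, path) VALUES (?, ?, ?)",
        [(expense_id, part, p) for part, p in enumerate(scan_paths, start=1)],
    )
    conn.commit()
    return expense_id


def totals_by_category(conn):
    """Return {category: total cents} across all expenses."""
    rows = conn.execute(
        "SELECT category, SUM(cents) FROM expense "
        "GROUP BY category ORDER BY category"
    )
    return {category: total for category, total in rows}
```

A long grocery receipt would then go in as something like add_expense(conn, "2012-03-03", "Meijer", 8412, "groceries", "receipt", ["scan_a.jpg", "scan_b.jpg"]), with the file names and amount invented for the example, while a check would be the same call with no scan paths.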
