xactfetch

dustin

Author	SHA1	Message	Date
Dustin	bdcb8c93b6	xactfetch: Suppress asyncio InvalidStateError There is currently a [bug][0] in the Python Playwright API that causes _asyncio_ to raise an `InvalidStateError` occasionally when the `PlaywrightContextManager` exits. This causes the program to exit with a nonzero return code, even though it actually completed successfully, which will cause the Job to be retried. To avoid this, we can catch and ignore the spurious exception. I've reorganized the code a bit here because we have to wrap the whole `with` block in the `try`/`except`; moving the contents of the block into a function keeps the indentation level from getting out of control. [0]: https://github.com/microsoft/playwright-python/issues/2238	2024-07-11 22:34:49 -05:00
Dustin	3ff18d1042	container: Add secretsocket, chase2fa scripts While the original intent of the `secretsocket` script was to have `rbw` run outside the `xactfetch` container, that is only useful during development; both processes need to run in the container in Kubernetes.	2024-07-11 21:50:27 -05:00
Dustin	0f9b3a5ac5	secretsocket: Respect SECRET_SOCKET_PATH The `secretsocket` server will now create its IPC socket at the location specified by the `SECRET_SOCKET_PATH` environment variable, if set. This way, both `secretsocket` and `xactfetch` can be pointed to the same location with this single variable.	2024-07-11 21:50:27 -05:00
Dustin	e4742f1c6e	container: Optimize layer cache usage With the addition of ancillary scripts like `entrypoint.sh`, the `COPY .` instruction in the build stage results in a full rebuild of the final image for every change. To avoid this, we now only copy the files that are actually required to build the wheel. The other scripts are copied later, using an intermediate layer. This avoids needing a `COPY` instruction, and therefore a new layer in the final image, for each script. Hypothetically, we could use `RUN --mount=bind` and copy the files with the `install` command, but bind-mounting the build context doesn't actually work; SELinux prevents the container builder from accessing the source directory directly.	2024-07-11 21:50:27 -05:00
Dustin	76cb7c7958	container: Rebase on dch-base	2024-07-11 21:50:27 -05:00
Dustin	bef7206642	entrypoint: Start secretsocket server if needed If the `SECRET_SOCKET_PATH` environment variable is not set, or refers to a non-existent path, then we assume we need to manage the `secretsocket` server ourselves.	2024-07-11 21:50:27 -05:00
Dustin	28fe49c2b2	xactfetch: Save Playwright trace for failed runs Playwright has a nifty feature called the [Trace Viewer][0], which you can use to observe the state of the page at any given point during the browsing session. This should make troubleshooting failures a lot easier. [0]: https://playwright.dev/python/docs/trace-viewer-intro	2024-07-11 21:48:47 -05:00
Dustin	9f113d6a3f	xactfetch: Switch to headed Chrome Earlier this week, `xactfetch` stopped being able to log in to the Chase website. After logging in, the website just popped up a message that said "It looks like this part of our website isn't working right now," with a hint that I should try a different browser. I suspect they have enhanced their bot detection/scraping resistance, because the error only occurs when `xactfetch` is run from inside a container. It happens every time in that case, but never when I run it on my computer directly. After several hours of messing with this, the only way I was able to get it to work is to use full-blown headed Chromium. Neither headless nor headed Firefox works, nor does headless Chromium. This is a bit cumbersome, but not really a big deal. Headed Chromium works fine in an Xvfb session.	2024-07-11 21:34:11 -05:00
Dustin	8de0d93eb1	xactfetch: chase: Handle SMS 2-factor auth When logging in to the Chase website with a fresh browser profile, or otherwise without any cookies, the user will be required to "validate the device" using a one-time code delivered via SMS. Previously, I handled this by running the `xactfetch` script with a headed browser, manually entering the verification code when the prompt came up. Then, I would copy the `cookies.json` file, now containing a cookie indicating the device had been verified, to the Kubernetes volume, where it would be used by the production pod. Now that `xactfetch` uses asyncio, it is possible for the Chase `login` method to wait for one of multiple conditions: either login succeeds, or SMS 2FA is required. In the case of the latter, we can get the 2FA code from the secret server and enter it into the form to complete the login process. The real magic here is how we're getting the 2FA code from the SMS message. There are two components to this. First, I've installed [SMS to URL Forwarder][0] on my phone. This app does what it says on the tin: it relays SMS messages to an HTTP(S) server. I have configured it to forward messages from the Chase SMS 2FA short code to an _ntfy_ topic. The second component is the `chase2fa` script, which is called by the secret server. This script listens for notifications on the _ntfy_ topic where the SMS messages are forwarded. When a message arrives, it extracts the verification code using a simple regular expression that identifies a several-digit number. With all these pieces in place, the `xactfetch` script is no longer thwarted by the SMS 2FA barrier! [0]: https://github.com/bogkonstantin/android_income_sms_gateway_webhook	2024-07-11 21:21:03 -05:00
Dustin	43aba0c848	Switch to async API Using the Playwright async API is the only way to wait for one of multiple conditions. We will need this capability in order to detect certain abnormal conditions, such as spurious 2FA auth or interstitial ads.	2024-07-10 14:54:23 -05:00
Dustin	b30b38f76f	secretsocket: Handle secrets via external process `xactfetch` has three different ways of reading secret values: * From environment variables * By reading the contents of a file (specified by environment variables) * By looking them up in the Bitwarden vault This is very cumbersome to work with, especially when trying to troubleshoot using the container image locally. To make this easier, I've factored out all secret lookup functionality into a separate process. This process listens on a UNIX socket and implements a very simple secret lookup protocol. The client (`xactfetch` itself in this case) sends a string key, identifying the secret it wants to look up, terminated by a single line feed character. The `secretsocket` server looks up the secret associated with that key, using the method defined in a TOML configuration file. There are four supported methods: * Environment variables * External programs * File contents * Static strings The value returned by the corresponding method is then sent back to the client via the socket connection, again as a string terminated with a line feed. Moving the secret handling into a separate process simplifies the environment configuration needed in order to run `xactfetch`. Notably, when running it in a container, only the `secretsocket` socket needs to be mounted into the container. Since `rbw` is executed by the server process now, rather than `xactfetch` directly, the vault does not need to be present in the `xactfetch` container. Indeed, none of the secret values need to be present in the container.	2024-07-10 14:54:23 -05:00
Dustin	72eae4d5b3	Add CLI argument for selecting banks When debugging a failure for one bank's website, I often want to run the fetch for just that bank. To date, I've been commenting out the other bank, but that is silly. Now, `xactfetch` can target a subset of banks by specifying their name slug(s) as CLI arguments.	2024-07-07 18:24:33 -05:00
Dustin	dbc8be8e82	ci: Add Jenkins pipeline	2024-05-17 10:14:57 -05:00
Dustin	4120804dc4	chase: Fix login form fill Chase loves to make subtle, invisible changes to their website, presumably to break screen scrapers like this.	2024-05-17 10:10:41 -05:00
Dustin	2890597673	chase: Update card button text Chase changed the name of my credit card from CREDIT CARD to Amazon Visa. Just in case they change it again or something, let's match only on the card number.	2024-03-24 10:58:42 -05:00
Dustin	dd0edc599e	chase: Ensure transaction list is not empty By default, the transaction list for the Chase credit card shows transactions that have posted since the last statement. This list can sometimes be empty, particularly on the day the statement is issued. When this is the case, clicking the _Download Account Activity_ button does not work; it simply displays a message stating "There's no account activity showing to download." Since we are going to adjust the date range on the download form anyway, it doesn't matter what's showing; we just need the button to work. Thus, we now set the page to show all transactions and then click the button.	2024-03-11 13:26:48 -05:00
Dustin	31dcec331e	Make URLs, etc. configurable	2023-12-12 08:09:41 -06:00
Dustin	a984d643a7	Add example systemd units	2023-12-12 08:09:41 -06:00
Dustin	123d8c8630	Add Containerfile	2023-12-12 08:09:41 -06:00
Dustin	082a5fa4f9	meta: Relax Playwright dependency version Playwright needs to be updated frequently in order to update its Firefox build. The Chase website has a very strict browser support policy, and frequently drops support for old Firefox versions.	2023-12-12 08:09:40 -06:00
Dustin	6999bd4ac5	Remove unlock ntfy message I've moved the bank website credentials to a shared collection in Bitwarden and made them accessible to an account dedicated to `xactfetch`. Using the `pinentry-stub` script, `rbw` can now auto-unlock the vault, using the password in the file referred to by the `PINENTRY_PASSWORD_FILE` environment variable. This means that `xactfetch` can now run completely automatically, without any input from me.	2023-12-12 08:09:40 -06:00
Dustin	dd3f12dfa4	Do not send ntfy messages when debugging While debugging `xactfetch`, I do not need it to send me notifications about failures, etc., since I am sitting at my computer. To suppress them, I can now set the `DEBUG_NTFY` environment variable to `0`.	2023-12-12 08:09:40 -06:00
Dustin	7e8fae14e6	Improve handling of backdated transactions Sometimes transactions show up in the export with the previous day's date. When this happens, these transactions may get skipped, since they might have the same date as the most recent transaction in Firefly. To help avoid skipping transactions, we need the start date to be the same as the most recent transaction, rather than the next day. This can cause duplicate imports, but fortunately, the Firefly Data Importer handles this fairly well.	2023-12-12 08:09:40 -06:00
Dustin	6091666471	Check latest transaction before logging in If the latest transaction was recent enough to skip importing transactions, we don't even need to log in to the bank websites. Thus, we should delay the login step until after we've checked this.	2023-12-12 08:09:40 -06:00
Dustin	22a5c6972e	meta: Fix version generation setuptools_scm is not used unless a `tool.setuptools_scm` table is present in `pyproject.toml`.	2023-12-12 08:09:40 -06:00
Dustin	ddee93c8e4	Import CSV files via HTTP importer Since I ultimately want to run `xactfetch` in Kubernetes, running the importer in a container as a child process doesn't make much sense. While running `podman` in a Kubernetes container is possible, getting it to work is nontrivial. Rather than go through all that effort, I think it makes more sense to just use HTTP to communicate with the importer I already have running. I had originally chosen not to use the web importer because of how I have it configured to use Authelia for authentication. The importer itself does not have any authentication beyond the "secret" parameter (which is not secret at all, given that it is passed in the query string and thus visible to anyone and stored in access logs), so I was hesitant to add an access control rule to bypass authentication for the `/autoupload` path. Fortunately, I discovered that Authelia will use the value of the `Proxy-Authorization` header to authenticate the request without redirecting to the login screen. With just a couple of lines in the Ingress configuration, I got it to work using the regular `Authorization` header as well: ```yaml kind: Ingress metadata: annotations: nginx.ingress.kubernetes.io/auth-snippet: | proxy_set_header Proxy-Authorization $http_authorization; proxy_set_header X-Forwarded-Method $request_method; nginx.ingress.kubernetes.io/configuration-snippet: | proxy_set_header Authorization ""; ```	2023-12-12 08:09:40 -06:00
Dustin	ca8bff8fc5	chase: Handle both CSV schemata Apparently, Chase has switched back to the CSV schema without the Card column at the beginning. Just in case they decide to flip-flop on that field forever, we better try to handle both cases.	2023-11-04 17:43:55 -05:00
Dustin	45b9e64ec1	chase: Update CSV mapping Chase added a new "Card" field to the beginning of each record in their CSV exports.	2023-10-09 10:03:41 -05:00
Dustin	5ea5d09b30	chase: Improve navigation robustness Chase likes to subtly change their website fairly regularly, usually by introducing more ads or changing the location of existing widgets.	2023-10-08 09:50:15 -05:00
Dustin	d1c947c549	chase: Fixes for site updates Chase made some minor updates to their site recently which affected some of the element locators. The propaganda in the right-hand column of the landing page has changed, and the Download Account Activity form is still really terrible, and now behaves even more strangely.	2023-07-24 11:17:50 -05:00
Dustin	9831fa818f	commerce: Handle interstitial ads Commerce likes to occasionally inject ads and other propaganda after the login page, before loading the account summary page. To handle this, we may need to specifically navigate to the account summary page after logging in.	2023-07-24 11:15:52 -05:00
Dustin	805aa40e20	commerce: Use modal form to download CSV The Commerce Bank website no longer allows navigating directly to `Download.ashx`; doing so just returns a generic "we're sorry" error. They appear to have added some CSRF protection or something that makes this not work. As a result, we have to go fill out the form on the Download Transactions modal dialog in order to get the download to work correctly.	2023-07-13 18:01:34 -05:00
Dustin	3b432fc7d6	ntfy: Handle non-ASCII characters in message In order to set the message for a notification with an attachment, the text must be specified in the `Message` request header. Unfortunately, HTTP header values are limited to the Latin-1 character set, so Unicode characters cannot be included. As of ntfy 2.4.0, however, the server can decode base64-encoded headers using the RFC 2047 scheme. To maintain compatibility with older ntfy servers, the `ntfy` function will only encode message contents this way if the string cannot be encoded as ASCII.	2023-06-23 09:33:24 -05:00
Dustin	43bf08eae8	chase: Remove second page load wait When there are multiple accounts associated with a Chase online banking user, the dashboard page layout changes. Detailed account history is no longer shown, so the elements we were waiting for in the "Waiting for page to load completely" step never appear. Since we're navigating directly to the download account transactions page now, anyway, we do not even need to wait for this button to appear.	2023-05-22 17:22:56 -05:00
Dustin	55d5f6bd1a	Include error with ntfy failure messages Although it is undocumented, ntfy accepts a `Message` header along with a file upload, which sets the message content of the notification when a file is attached. Since HTTP headers cannot contain multiple lines, the newline character has to be escaped. The ntfy server performs unescaping automatically.	2023-05-22 09:08:27 -05:00
Dustin	58f417aea9	chase: Go directly to the download page When there are no transactions in the default display, the Download account activity button is disabled. To avoid failing in this case, we now navigate directly to the download page. This requires explicitly selecting the credit card account from the dropdown list, as it is not pre-filled when the page is loaded directly.	2023-05-11 22:55:42 -05:00
Dustin	7cab766c38	Refactor error handling The `ntfyerror` context manager replaces `screenshot_failure` for handling online banking interaction failures. It has several advantages, notably: * takes a screenshot of the browser page before logging out * cleaner suppression of exceptions, with success tracking * sends an `ntfy` message, with the screenshot attached	2023-05-11 22:52:35 -05:00
Dustin	7683ff5760	Initial commit	2023-05-09 14:08:23 -05:00
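
The lookup protocol described in commit b30b38f76f (the client sends a key terminated by a line feed, and the server replies with the value terminated by a line feed) can be sketched roughly as follows. This is an illustration only, not the actual `secretsocket` code: the function names, the in-memory `secrets` dictionary, the empty reply for unknown keys, and the use of `socket.socketpair()` in place of the UNIX socket at `SECRET_SOCKET_PATH` are all assumptions.

```python
import socket
import threading


def serve_one(conn: socket.socket, secrets: dict[str, str]) -> None:
    """Read one LF-terminated key and reply with the LF-terminated value."""
    with conn, conn.makefile("rw", encoding="utf-8", newline="\n") as stream:
        key = stream.readline().rstrip("\n")
        # Assumption: an unknown key is answered with an empty line.
        stream.write(secrets.get(key, "") + "\n")
        stream.flush()


def get_secret(conn: socket.socket, key: str) -> str:
    """Send a key and return the value from the single reply line."""
    with conn.makefile("rw", encoding="utf-8", newline="\n") as stream:
        stream.write(key + "\n")
        stream.flush()
        return stream.readline().rstrip("\n")


def demo() -> str:
    # socketpair() stands in for the UNIX socket used by the real server.
    client, server = socket.socketpair()
    worker = threading.Thread(
        target=serve_one, args=(server, {"bank.password": "hunter2"})
    )
    worker.start()
    try:
        return get_secret(client, "bank.password")
    finally:
        client.close()
        worker.join()
```

Because each message is a single line, either side can read with `readline()` and needs no length prefix, at the cost of not supporting secret values that contain a line feed.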

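
The conditional header encoding described in commit 3b432fc7d6 might look something like the sketch below: plain ASCII messages are passed through untouched so that older ntfy servers still work, and anything else is wrapped as an RFC 2047 base64 encoded-word. The function name `encode_message_header` is hypothetical and is not taken from the `xactfetch` source.

```python
import base64


def encode_message_header(message: str) -> str:
    """Return a value suitable for the ntfy `Message` request header."""
    try:
        # ASCII-only text is sent as-is for compatibility with old servers.
        message.encode("ascii")
    except UnicodeEncodeError:
        # Otherwise wrap the UTF-8 bytes in an RFC 2047 "B" encoded-word.
        encoded = base64.b64encode(message.encode("utf-8")).decode("ascii")
        return f"=?UTF-8?B?{encoded}?="
    return message
```

For example, `encode_message_header("ok")` returns `ok` unchanged, while a message containing an accented character comes back in the `=?UTF-8?B?...?=` form.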
38 Commits (master)