Web Automation

Web automation means automating interactions with websites. People do it for a variety of reasons, such as testing a website, scraping data from it, or automating a task that would otherwise be tedious to do by hand. Here I will look at three approaches: JavaScript, Playwright, and AutoHotkey. I will discuss the advantages and disadvantages of each, with a particular focus on detectability.

Detectability matters in the current age of web technology, where automation is increasingly countered by detection systems like reCAPTCHA v3 and Cloudflare bot detection, among others. That is for good reason: no website owner wants their site scraped by bots and their work made irrelevant by other websites reposting the content or by having it fed into large language models. There are, however, still legitimate, smaller-scale applications of automation, and having working methods for them is important.

JavaScript

One of the easiest ways to get into web automation is simply to use JavaScript in the browser. This can be done in your browser's developer console or with a JavaScript-injection extension (Violentmonkey is open-source and works on both Firefox and Chrome). This method covers the majority of use cases.

As a very easy example, try clicking the button below and observe the result.

Now, try doing the same thing with JavaScript by opening the developer console (Ctrl+Shift+J on Chrome or Ctrl+Shift+K on Firefox) and entering the following:

document.querySelector('#example-button').click()

Both clicking the button above and using JavaScript to click it will result in an alert. However, the message is different when using JavaScript. This is because websites can easily distinguish whether an action was performed by a user or by a script. The technical details are as follows: when an action occurs, an Event object is generated and passed to the event handlers. This object carries a read-only property set by the browser called isTrusted, which is true when the event was generated by a user action and false when it was dispatched by script, including through element.click(). So a website (or a bot protection script) need only check whether event.isTrusted === false to see if an action was performed by JavaScript.
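The check described above can be sketched in a few lines. This is a minimal illustration rather than how any particular site implements it: describeOrigin, target and seen are hypothetical names, and a bare EventTarget stands in for the button so the snippet also runs outside a page.

```javascript
// Classify an event by origin using the read-only isTrusted flag.
function describeOrigin(event) {
  return event.isTrusted ? 'user' : 'script';
}

// A bare EventTarget stands in for the button element (an assumption
// made so the sketch does not depend on a particular page).
const target = new EventTarget();
const seen = [];
target.addEventListener('click', (event) => {
  seen.push(describeOrigin(event));
});

// Events constructed and dispatched from script are never trusted.
target.dispatchEvent(new Event('click'));

console.log(seen); // logs an array holding the single entry 'script'
```

On a real page the same handler would be attached to the button itself, and a genuine mouse click would take the 'user' branch.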

Playwright/Puppeteer/Selenium

Playwright is a library for creating and controlling browser instances (Chromium-based browsers such as Chrome and Edge, Firefox, and WebKit, the engine behind Safari) for the purpose of test automation. Similar libraries include Puppeteer (which focuses on Chrome and Chromium, with Firefox support added later) and Selenium (which has been around much longer). All of these libraries are similar and share the same fundamental advantages and disadvantages.

These libraries work by launching a browser instance and then letting you send commands to it. These commands include navigating to a URL, clicking an element, querying for information on a webpage, typing text, and so on. One advantage over plain JavaScript is that input actions such as clicks and key presses are issued through the browser's automation interface rather than from page script, so the resulting events have Event.isTrusted set to true. This avoids that very trivial detection method.
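As a rough sketch of that command flow, the function below launches a browser, navigates, clicks and reads the page title. It takes the browser type as a parameter so that it does not depend on an installed package. clickAndReadTitle is a hypothetical name, and the URL and selector in the usage comment are placeholders. The calls used (launch, newPage, goto, click, title, close) follow Playwright's API.

```javascript
// Drive one page through a Playwright-style browser type (e.g. `chromium`).
async function clickAndReadTitle(browserType, url, selector) {
  const browser = await browserType.launch();
  try {
    const page = await browser.newPage();
    await page.goto(url);
    // The click is issued through the automation interface,
    // not by calling element.click() inside the page.
    await page.click(selector);
    return await page.title();
  } finally {
    // Always release the browser, even if a step above throws.
    await browser.close();
  }
}

// Usage, assuming Playwright is installed (not executed here):
//   const { chromium } = require('playwright');
//   clickAndReadTitle(chromium, 'https://example.com', '#example-button')
//     .then(console.log);
```

Passing the browser type in also makes it easy to swap chromium for firefox or webkit without touching the function.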

There are, however, other ways to detect automation from these libraries. These include a long list of JavaScript APIs that behave differently in an automated browser instance (the navigator.webdriver flag is a well-known example), telltale User-Agent strings, and extra global JavaScript variables that the libraries create for their own functionality. For that reason, there are packages like playwright-stealth, puppeteer-extra-plugin-stealth and selenium-stealth, which attempt to hide these discrepancies from normal browsers. However, these packages do not hide every difference and are not necessarily kept up to date.
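To make two of these signals concrete, here is a small sketch of the kind of check a page could run. It is illustrative only: automationSignals and the signal labels are hypothetical, real detection scripts look at far more than this, and the function takes a navigator-like object so it can be tried outside a browser.

```javascript
// Collect coarse automation hints from a navigator-like object.
function automationSignals(nav) {
  const signals = [];
  // navigator.webdriver comes from the WebDriver specification and is
  // true when the browser is under automation control.
  if (nav.webdriver === true) {
    signals.push('webdriver-flag');
  }
  // Headless Chrome has historically advertised itself in the User-Agent.
  if (/HeadlessChrome/.test(nav.userAgent || '')) {
    signals.push('headless-user-agent');
  }
  return signals;
}

// In a page this would be called as automationSignals(navigator).
// The object below is a made-up sample for demonstration.
const sample = {
  webdriver: true,
  userAgent: 'Mozilla/5.0 HeadlessChrome/120.0.0.0',
};
console.log(automationSignals(sample));
```

Stealth packages work largely by patching values like these, which is why they need to keep pace with whatever new checks detection scripts add.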

AutoHotkey

AutoHotkey is an automation scripting language for Windows capable of automating many operating system tasks like keyboard and mouse actions. It is very powerful and can be used for a variety of tasks, including web automation. All you need is a normal browser and an AutoHotkey script that interacts with the web content through keyboard and mouse commands.

While AutoHotkey is not purpose-built for web automation, it's possible to query the browser window's position and size, take a screenshot of the element you care about to use as a reference image, then use ImageSearch to find the on-screen coordinates of that part of the page. This lets you interact with elements on the page based on how they look.

This is notably less convenient than using JavaScript or a purpose-built web automation library because you can't directly query for elements, but it has the advantage of not being nearly as detectable by the website. It uses the same browser as a normal user, and the Event.isTrusted value is true because, as far as the browser is concerned, the user is performing the actions. The main remaining way to detect this is behavioral: observing mouse movements or typing speed to judge whether a bot is performing the actions. This can be mitigated by slightly randomizing movements and timing and by making typing speed more realistic.
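The timing side of that mitigation is simple to sketch. The example below is written in JavaScript to match the rest of this article rather than in AutoHotkey; jitteredDelay and typingDelays are hypothetical helpers, and the default of 120 ms with a spread of 60 ms is an arbitrary assumption, not a measured human typing rate.

```javascript
// Return a delay in milliseconds drawn uniformly from the range
// baseMs - spreadMs up to baseMs + spreadMs, never below zero.
// The random source is injectable so the function can be checked.
function jitteredDelay(baseMs, spreadMs, random = Math.random) {
  const offset = (random() * 2 - 1) * spreadMs;
  return Math.max(0, Math.round(baseMs + offset));
}

// Build one delay per character for typing a string.
function typingDelays(text, baseMs = 120, spreadMs = 60) {
  return Array.from(text, () => jitteredDelay(baseMs, spreadMs));
}

console.log(typingDelays('hello'));
```

An AutoHotkey script would apply the same idea by pausing for a freshly drawn random duration between each simulated key press or mouse move.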

Conclusion

There are many ways to automate web tasks, and each has its own advantages and disadvantages. JavaScript is the easiest to get started with, but is also the most detectable. Playwright, Puppeteer and Selenium are more difficult to get started with, but are slightly less detectable, especially with precautions. AutoHotkey is the most difficult to get started with, but is the least detectable, since a website has only mouse and keyboard behavior to monitor.

In general, I would recommend JavaScript for most tasks, such as userscripts that improve a website's functionality while you're using it. For actual testing of your own websites, or for automation on websites that don't employ bot detection, Playwright, Puppeteer and Selenium are good options because of their convenience and their programming language bindings, which are useful for processing the data the automation collects. For automation on websites that employ bot detection, AutoHotkey is the best option because of its low detectability when the correct precautions are taken.