I Am 'Totally' Human: Bypassing the reCaptcha

Automating tasks on the Internet has attracted growing interest in recent years. By writing a script, a user can automate the submission of a form to a web server and so prevent legitimate users from acquiring a scarce product on an e-commerce site (concert tickets, clothing pieces) or, in extreme cases, flood the server with a large number of requests for offensive purposes. System administrators have tried to stop automated scripts from submitting forms primarily by implementing a “captcha”, a Completely Automated Public Turing Test to Tell Computers and Humans Apart.

Our task is to build a program that loads a given URL, populates a form with provided data, and (most importantly) completes the challenge so that the form can be submitted. Using an audio transcription API, we build a bot that retrieves an audio captcha challenge from the Google reCaptcha widget, sends the audio sample through the API, and uses the returned answer to submit the form and masquerade as “totally human.” Although we wrote the code that performs these tasks, the technology used in our project is not of our own design. We combine existing components: an audio formatting library that splits the challenge sample, a web browser controller that automates a user's interaction with a site, and a public API that transcribes audio files. Together they form a script that processes and passes a Google reCaptcha challenge. The same procedure could be carried out by anyone reasonably adept at using a computer, which indicates that other programmers could reuse this project.


2017 13th International Conference on Signal-Image Technology & Internet-Based Systems (SITIS), Jaipur, India.