Download the FourthPartyMobile project report here

What is FourthParty?

FourthParty is an open-source platform that automates the measurement of dynamic web content (e.g. cookies and JavaScript calls) by instrumenting Mozilla Firefox. It runs on virtually every modern desktop operating system. The FourthParty codebase is at the core of FourthPartyMobile, so definitely check out their website! ;-)

What is FourthPartyMobile?

FourthPartyMobile is a modified version of FourthParty that supports Android-based mobile devices, such as smartphones and tablets. It is implemented in Java and JavaScript, leveraging both the Android SDK and the Mozilla Add-On SDK. Persistent storage is fully compliant with FourthParty's SQLite database schema, so traditional and mobile crawls share a standardized representation, which facilitates data analysis.
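
To illustrate why a shared schema helps, here is a minimal sketch in Python (chosen for brevity; FourthPartyMobile itself is written in Java and JavaScript). The table requests_demo, its columns, the host names, and the helper functions are all hypothetical stand-ins, not the actual FourthParty schema. The point is only that one query function can run unchanged against a desktop crawl and a mobile crawl.

```python
import sqlite3

# Hypothetical stand-in table; NOT the actual FourthParty schema.
DEMO_SCHEMA = "CREATE TABLE requests_demo (page_host TEXT, request_host TEXT)"


def make_crawl(rows):
    """Build an in-memory crawl database with the shared demo schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(DEMO_SCHEMA)
    conn.executemany("INSERT INTO requests_demo VALUES (?, ?)", rows)
    return conn


def other_hosts(conn):
    """Return distinct request hosts that differ from the page host.

    This is a naive string comparison, not a real third-party test.
    """
    query = (
        "SELECT DISTINCT request_host FROM requests_demo "
        "WHERE request_host != page_host ORDER BY request_host"
    )
    return [row[0] for row in conn.execute(query)]


# Made-up sample data for a desktop crawl and a mobile crawl.
desktop = make_crawl([
    ("news.example", "news.example"),
    ("news.example", "ads.example"),
])
mobile = make_crawl([
    ("news.example", "ads.example"),
    ("news.example", "tracker.example"),
])

# The same query works on both databases because the schema is identical.
desktop_hosts = other_hosts(desktop)
mobile_hosts = other_hosts(mobile)
only_mobile = sorted(set(mobile_hosts) - set(desktop_hosts))
```

With the sample rows above, only_mobile holds the hosts seen in the mobile crawl but not in the desktop crawl.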

Below is a UML diagram of the FourthParty database schema:

Why FourthPartyMobile?

FourthPartyMobile was developed as part of a final project for the Fall 2012 offering of Arvind Narayanan's COS 597D - Advanced Topics in Computer Science: Privacy Technologies at Princeton University. We wished to automate the detection of third-party tracking mechanisms while browsing the web on a mobile device. To this end, we decided to adopt the FourthParty project’s approach and instrument a popular open-source mobile browser (i.e. Firefox Mobile) to be used as an enhanced web crawler. This enabled us to log realistic end-user interactions (e.g. execution of embedded scripts) as opposed to just downloading each web page’s static content, which is what traditional web crawlers do.

FourthPartyMobile's Architecture

Mobile application development poses a variety of challenges that must be addressed before a mobile web crawler can be built:

  • Mobile devices have limited amounts of RAM, so applications should not rely on large data structures stored in main memory.
  • Security permissions on mobile devices are strict, which means that writing data to persistent storage is not always an option.
  • Processing power in mobile devices is limited, so computationally intensive procedures, such as parsing a web page, should be delegated to an external entity.
  • Mobile network bandwidth is a limited resource, so large data transfers should be avoided.
  • Battery life must be preserved as much as possible if the application is aimed at the general public.

FourthPartyMobile's architecture delegates most of the computation and storage to a supporting server, limiting the mobile device’s responsibilities to fetching one website at a time and generating a log of its latest interactions (e.g. cookies, JavaScript, embedded HTTP objects). The crawling plugin running on the mobile device sends the interaction log for the website being visited, in the form of SQL statements, to the crawling backend running on a server. This way, the amount of state kept in the mobile device’s main memory is minimal, and the crawl database, which can be several megabytes in size, is generated on the server side.
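
The division of labor described above can be sketched as follows. This is a minimal illustration in Python, not the project's actual Java/JavaScript code: the table cookies_demo and the functions sql_literal, build_log, and apply_log are hypothetical names, and the network transfer between device and server is omitted entirely.

```python
import sqlite3

# Hypothetical, simplified table; the real FourthParty schema is richer.
SCHEMA = (
    "CREATE TABLE IF NOT EXISTS cookies_demo "
    "(page_url TEXT, name TEXT, value TEXT)"
)


def sql_literal(text):
    """Quote a string as an SQL literal by doubling single quotes."""
    return "'" + text.replace("'", "''") + "'"


def build_log(page_url, cookies):
    """Device side: turn one page's observed cookies into INSERT statements.

    Only this short list of strings is kept in memory per page.
    """
    return [
        "INSERT INTO cookies_demo (page_url, name, value) VALUES (%s, %s, %s)"
        % (sql_literal(page_url), sql_literal(name), sql_literal(value))
        for name, value in cookies
    ]


def apply_log(conn, statements):
    """Server side: run one page's statements in a single transaction."""
    with conn:  # commits on success, rolls back on error
        for statement in statements:
            conn.execute(statement)


# The server owns the crawl database; an in-memory one is used here.
conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)

# Made-up sample data standing in for one visited page.
log = build_log("http://example.com/", [("id", "abc"), ("pref", "it's")])
apply_log(conn, log)

rows = conn.execute(
    "SELECT name, value FROM cookies_demo ORDER BY name"
).fetchall()
```

Shipping ready-made SQL keeps the device-side logic small, but it also means the backend executes whatever statements it receives, so a setup like this presumably assumes the device is trusted.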

Last modified: 2013/01/16 19:08