As web scraping becomes easier to use, more and more people can leverage the world’s web resources. As this trend grows, structured data from the web empowers businesses and enables a wave of new business ideas to become reality. Now a new technology on the market, “self-contained agents”, might just turn that wave into a tsunami!
Self-contained agents are web scraping agents that are compiled and packaged with the web scraping software runtime so they are standalone.
When Content Grabber launched early this year, it included the ability to create scraping agents and export them as self-contained agents. This lets developers build self-contained web scraping agents that run independently of the licensed software and can be distributed royalty-free. With a Content Grabber Premium Edition license, developers can also white label these agents as their own.
All of a sudden, that agent you developed to solve a problem at work can be packaged as a self-contained agent and used to create new revenue streams for your business.
Work out an agent
How to create self-contained agents
Both the Professional and Premium Edition licenses of Content Grabber allow users to create self-contained agents. From the main menu, choose File -> Export Agent and select the option Create Self-Contained agent in the Export Agent window.
Before clicking Export to create the self-contained agent, you can click Customize Design to customize the agent UI components and text being displayed.
Customize agent GUI
A self-contained agent includes a user interface that allows end users to configure and run the agent. As the developer, you can control some of the text and images displayed on this user interface, as well as which agent options the end user is allowed to configure.
The Customize Self-Contained Agent screen has three tabs where you specify the text and images used on the user interface (UI) of the self-contained agent.
The Templates tab also allows you to specify custom template files. The standard configuration screens of a self-contained agent include Content Grabber promotion. By setting your own custom HTML display templates, you can replace these screens with your own designs and so white label your self-contained agent.
Input data
A self-contained agent can use any input data that is supplied by a database or CSV file. An agent can also have multiple public data providers, and each data provider can be selected from a drop-down box in the agent's user interface.
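As a rough illustration of the CSV side of this, the sketch below writes a small input file that an agent could be pointed at. It is only an assumption about what such a file might look like: the function name write_input_csv, the "url" column header and the file name are hypothetical, and the columns a real agent reads depend entirely on how that agent was built.

```python
import csv

def write_input_csv(path, urls):
    """Write one start URL per row under a single header column.

    The "url" header is a hypothetical column name; use whatever
    column names your own agent's data provider expects.
    """
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["url"])
        for u in urls:
            writer.writerow([u])
    return len(urls)

if __name__ == "__main__":
    # Hypothetical file name and example URLs.
    write_input_csv("agent_input.csv", [
        "https://example.com/page1",
        "https://example.com/page2",
    ])
```

Generating the input file with a script like this keeps the agent itself unchanged while the list of inputs varies from run to run.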
Output data to file formats only
A self-contained agent can export data only to file formats: Excel, CSV or XML. It cannot export to databases or execute a data export script.
If you need to export data to a database or use an export script, you can build your own custom application using the Content Grabber runtime, or run your agents using the Content Grabber command-line tool. Both the runtime and the command-line tool can be distributed royalty-free if you own a Content Grabber Premium Edition license.
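A lighter-weight workaround, outside of Content Grabber itself, is to let the self-contained agent export a CSV file and then load that file into a database in a separate step. The sketch below does this with Python's standard csv and sqlite3 modules. It is a minimal illustration under assumptions, not part of the product: the function name load_csv_into_sqlite, the file names and the table name are hypothetical, and it assumes the export has a header row.

```python
import csv
import sqlite3

def _quote(identifier):
    """Quote a SQLite identifier so arbitrary header names are safe to use."""
    return '"' + identifier.replace('"', '""') + '"'

def load_csv_into_sqlite(csv_path, db_path, table):
    """Copy the rows of a CSV export into a SQLite table.

    Assumes the first CSV row is a header. Rows whose field count does
    not match the header are skipped. Returns the number of rows loaded.
    """
    # utf-8-sig tolerates a byte-order mark at the start of the file.
    with open(csv_path, newline="", encoding="utf-8-sig") as f:
        reader = csv.reader(f)
        header = next(reader)
        rows = [row for row in reader if len(row) == len(header)]

    columns = ", ".join(_quote(name) for name in header)
    placeholders = ", ".join("?" for _ in header)
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                f"CREATE TABLE IF NOT EXISTS {_quote(table)} ({columns})"
            )
            conn.executemany(
                f"INSERT INTO {_quote(table)} VALUES ({placeholders})", rows
            )
    finally:
        conn.close()
    return len(rows)

if __name__ == "__main__":
    # Hypothetical file and table names.
    print(load_csv_into_sqlite("agent_output.csv", "scraped.db", "products"))
```

Because the load step is a separate script, it can be scheduled to run after the agent finishes without changing the agent.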
Upgrading a Self-Contained Agent
If the target website of a self-contained agent changes and the agent needs to be updated in order to keep working correctly, you can upgrade it. Simply make the changes to the existing agent within Content Grabber and export the agent again. The end user can then overwrite their existing self-contained agent with the new one. Any configuration changes the end user has made to the agent will not be lost.
Drawback
The drawback of self-contained agents is that each agent has a minimum size of approximately 4 MB regardless of its complexity, because that is the minimum size of the Content Grabber runtime environment included in it.
Self-contained agents can run without Content Grabber installed, but they do depend on Internet Explorer and .NET 4+, which are pre-installed on most newer versions of Windows.
Summary
Content Grabber’s self-contained agents are a very exciting new tool for developers and business entrepreneurs. They can be easily packaged as standalone tools and branded as your own. They are relatively lightweight and easy to distribute online or by email, since the packaged Content Grabber runtime is usually less than 5 MB in size. And because each agent is an executable, no installation process is required.
Self-contained agents give developers a new reason to use web scraping software over cloud-based solutions. It will be interesting to see how this takes off.