Overview
A complete Microsoft Visual C++ (version 6.0) project is included as an example of how to write an HTML filtering plugin .dll that will interface with AdCruncher™ Proxy, but your HTML filtering plugin .dll can be written in any language that can produce a Windows .DLL. (It's just a Windows .DLL after all).
The basic concept is pretty straightforward: AdCruncher™ calls into the .dll passing it a pointer to the HTML code and the .dll returns back to AdCruncher™ a pointer to the modified HTML code. A return code indicates whether the plugin modified the HTML code, and thus whether AdCruncher™ should use the modified HTML code or continue using the original HTML code.
The pointer to the HTML code that is passed to the plugin is a simple char pointer -- i.e. a pointer to a null terminated array of type 'char' (i.e. a standard 'C' string). Same with the pointer the plugin .dll returns back.
The string that AdCruncher™ passes to the plugin .dll should not be freed. AdCruncher™ will do that whenever the plugin .dll returns back to it (since it was AdCruncher™ that did the 'new').
If the plugin .dll decides to pass back new HTML code (as opposed to simply modifying 'in place' the string of HTML code that AdCruncher™ passes to it -- see the provided sample), it is responsible for doing the 'new' (or 'malloc') and AdCruncher™ will do the 'delete' (or 'free') when it's done with it. You don't have to worry about freeing the string of HTML code that you pass back. AdCruncher™ does that for you.
There are only two entry points required (beyond the ones required by Windows): "SetupFilterHTML" and "FilterHTML".
The "SetupFilterHTML" entry point is called once (with the "bInitialize" field set to 'true') whenever a proxy first starts and then again (with the "bInitialize" field set to 'false') every time the user clicks on the Plugin Properties toolbar button.
This entry point is used to provide 'persistence' for the plugin's configuration information. This allows the plugin developer to present the user with their own configuration dialog and have the configuration information the user enters for that plugin to be saved along with the rest of the given proxy's properties.
Thus, even though the same plugin may be used for every proxy, each proxy will have its own unique plugin configuration since each proxy has its own unique configuration saved in its own unique .apf file.
The other entry point -- the "FilterHTML" entry point -- is called just after AdCruncher™ receives an HTML web page from the remote server and just before it passes it back to the browser.
Both entry points are passed a single argument consisting of a pointer to a structure that contains all of the parameters needed by that particular entry point and both entry points return an integer as a return code.
For the "SetupFilterHTML" entry point, a return code of 0 (zero, i.e. false) indicates no change to the plugin configuration and a return code of !0 (not zero, i.e. true) indicates the plugin's configuration was changed and thus the new information should be re-saved.
For the "FilterHTML" entry point, a return code of 0 (zero, i.e. false) indicates "do not filter" and a return code of !0 (not zero, i.e. true) indicates "yes, please DO filter". Returning a non-zero return code does NOT mean you have to return a new string of HTML code, however. It simply tells AdCruncher™ to NOT use the original HTML code and to use the plugin's modified HTML code instead.
The HTML code you return to AdCruncher™ can be the same HTML code that was passed to your plugin or it can be brand new HTML code that your plugin creates on its own. The value of the "pszOutHTML" field determines which it is.
If "pszOutHTML" is NULL, AdCruncher™ assumes your plugin has modified 'in place' the HTML code it was passed and will use it instead of the original copy it saved from the remote server. If "pszOutHTML" is not NULL, then it must point to a valid 'C' string of HTML code that AdCruncher™ will pass back to the browser. Once the data has been passed back to the browser, AdCruncher™ will then automatically "free" this string so you don't have to worry about memory leaks.
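The 'in place' case can be sketched roughly as follows. This is only an illustration, not the code from the provided sample: the FILTERHTML layout is copied from the interface definition in this document, the empty fallback macros exist only so the sketch builds outside Windows, and the "BlankTag" helper and the choice of the blink tag are made up for the example.

```cpp
#include <cstddef>
#include <cstring>

// Assumption: on Windows these macros come from <windows.h> and the sample
// project; they are defined empty here so the sketch also builds elsewhere.
#ifdef _WIN32
#include <windows.h>
#else
#define CALLBACK
#define WINAPI
#endif
#ifndef EXPORT
#define EXPORT
#endif

typedef void (CALLBACK* PLOGDEBUGFUNC)(void*, const char*);

typedef struct tagFILTERHTML
{
    size_t        cbSize;      // [in]  size of this structure
    char*         pszURI;      // [in]  ptr to URI
    char*         pszInHTML;   // [in]  ptr to original HTML code
    char*         pszOutHTML;  // [out] ptr to replacement HTML code
    PLOGDEBUGFUNC pLogDebug;   // [in]  ptr to function to log debug messages
    void*         pContext;    // [in]  debug logging context value
} FILTERHTML;

// Hypothetical helper: overwrites every occurrence of 'tag' with spaces so
// the string keeps its length. Returns the number of occurrences blanked.
// The match is case-sensitive.
static int BlankTag(char* html, const char* tag)
{
    int    count = 0;
    size_t len   = strlen(tag);
    for (char* p = strstr(html, tag); p != NULL; p = strstr(p + len, tag))
    {
        memset(p, ' ', len);
        ++count;
    }
    return count;
}

extern "C" int WINAPI EXPORT FilterHTML(FILTERHTML* pParms)
{
    if (pParms == NULL || pParms->pszInHTML == NULL)
        return 0;                       // nothing to work on: do not filter

    int changed = BlankTag(pParms->pszInHTML, "<blink>")
                + BlankTag(pParms->pszInHTML, "</blink>");

    // pszOutHTML is deliberately left NULL: a non-zero return then tells
    // the host to use the modified pszInHTML string.
    return changed != 0 ? 1 : 0;
}
```

Because the edit never changes the length of the string, nothing is allocated or freed by the plugin in this variant.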
Interface
As mentioned, there are only two required 'C' style entry points -- "SetupFilterHTML" and "FilterHTML" -- defined as follows:
// Copyright (c) 2001-2002, Software Development Laboratories, "Fish" (David B. Trout)
//////////////////////////////////////////////////////////////////////////////////////////
//
//                        EXPORTED FUNCTION DEFINITIONS
//                       -------------------------------
//              Copyright (c) 2001, 2002, "Fish" (David B. Trout)
//
//  These functions provide a 'C' language interface and can be
//  called from any type of app that can access a DLL: VB, C/C++,
//  PowerBuilder, etc.
//
//////////////////////////////////////////////////////////////////////////////////////////
//
//  Change History:
//
//  07/31/01    2.5.4   Added debug logging function.
//  08/01/01    1.1.0   Track version number separately from main product.
//  08/09/02    1.2.1   New "bInitialize" SETUPFILTER boolean parm to indicate to
//                      the filter to initialize itself without displaying dialog.
//
//////////////////////////////////////////////////////////////////////////////////////////

#pragma once

//////////////////////////////////////////////////////////////////////////////////////////

typedef void (CALLBACK* PLOGDEBUGFUNC)(void*,const char*);  // funct to log debug msg

//////////////////////////////////////////////////////////////////////////////////////////

typedef struct tagSETUPFILTERHTML   // modify the filtering parameters
{
    size_t         cbSize;          // [in]  size of this structure
    void*          pInParms;        // [in]  ptr to original parm bytes
    size_t         cbInSize;        // [in]  number of input parm bytes
    void*          pOutParms;       // [out] ptr to replacement parm bytes
    size_t         cbOutSize;       // [out] number of replacement parm bytes
    PLOGDEBUGFUNC  pLogDebug;       // [in]  ptr to function to log debug messages
    void*          pContext;        // [in]  debug logging context value
    int            bInitialize;     // [in]  true = init parms; false = modify parms
}
SETUPFILTERHTML;

//////////////////////////////////////////////////////////////////////////////////////////

typedef struct tagFILTERHTML        // do the actual filtering
{
    size_t         cbSize;          // [in]  size of this structure
    char*          pszURI;          // [in]  ptr to URI
    char*          pszInHTML;       // [in]  ptr to original HTML code
    char*          pszOutHTML;      // [out] ptr to replacement HTML code
    PLOGDEBUGFUNC  pLogDebug;       // [in]  ptr to function to log debug messages
    void*          pContext;        // [in]  debug logging context value
}
FILTERHTML;

//////////////////////////////////////////////////////////////////////////////////////////

typedef int (CALLBACK* PSETUPFILTERHTMLFUNC)(SETUPFILTERHTML*);  // modify filtering parms
typedef int (CALLBACK* PFILTERHTMLFUNC)(FILTERHTML*);            // do the actual filtering

//////////////////////////////////////////////////////////////////////////////////////////

extern "C"
{
    int WINAPI EXPORT SetupFilterHTML(SETUPFILTERHTML*);  // modify filtering parms
    int WINAPI EXPORT FilterHTML(FILTERHTML*);            // do the actual filtering
}

//////////////////////////////////////////////////////////////////////////////////////////
The first thing that needs to be mentioned (but what should be obvious) is that the "cbSize" field in both structures passed to your plugin should match the actual size of the structure itself. If it doesn't, it means the structure has changed (i.e. that there have been new fields added to the structure and your plugin needs to be recompiled to use the new structure). This is pretty much a standard Windows thing.
The next field in the FILTERHTML structure, "pszURI", is a pointer to the URI (Uniform Resource Identifier -- really just a "more proper" name for a URL) of the web page this HTML code is for (i.e. where this HTML code came from). Your plugin can use this to decide whether or not it should modify the HTML code, if it so desires. It is for informational purposes only right now.
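A plugin might guard against a structure mismatch and consult the URI with something like the sketch below. The helper names "IsExpectedSize" and "WantsToFilter" and the "ads.example.com" pattern are hypothetical, and what to do on a size mismatch is not specified here; declining to filter is just one conservative choice.

```cpp
#include <cstddef>
#include <cstring>

// Assumption: CALLBACK comes from <windows.h> on Windows; it is defined
// empty here so the sketch also builds elsewhere.
#ifdef _WIN32
#include <windows.h>
#else
#define CALLBACK
#endif

typedef void (CALLBACK* PLOGDEBUGFUNC)(void*, const char*);

typedef struct tagFILTERHTML
{
    size_t        cbSize;      // [in]  size of this structure
    char*         pszURI;      // [in]  ptr to URI
    char*         pszInHTML;   // [in]  ptr to original HTML code
    char*         pszOutHTML;  // [out] ptr to replacement HTML code
    PLOGDEBUGFUNC pLogDebug;   // [in]  ptr to function to log debug messages
    void*         pContext;    // [in]  debug logging context value
} FILTERHTML;

// Non-zero when the structure the host passed has the size this plugin
// was compiled against.
static int IsExpectedSize(const FILTERHTML* pParms)
{
    return pParms != NULL && pParms->cbSize == sizeof(FILTERHTML);
}

// Hypothetical policy: only pages whose URI mentions "ads.example.com"
// are filtered. A size mismatch always means "do not filter".
static int WantsToFilter(const FILTERHTML* pParms)
{
    if (!IsExpectedSize(pParms))
        return 0;
    return pParms->pszURI != NULL
        && strstr(pParms->pszURI, "ads.example.com") != NULL;
}
```

A FilterHTML implementation could call WantsToFilter first and simply return zero when it reports false, before touching any of the other fields.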
The next two fields are the important ones: "pszInHTML" and "pszOutHTML".
The "pszInHTML" field points to a copy of the HTML code that AdCruncher™ received from the remote server, not to the original code itself. You can do whatever you want to this HTML code, but you MUST NOT FREE THE POINTER! AdCruncher™ will do that for you. The same goes for the "pszOutHTML" pointer: once you hand it back, AdCruncher™ frees it, not your plugin.
The "pszOutHTML" pointer is set to NULL upon entry to your plugin. If your plugin returns non-zero back to AdCruncher™, this field should point to the HTML code that AdCruncher™ should use instead of the HTML code pointed to by the original "pszInHTML" field. This field points to the replacement HTML code that your plugin wishes AdCruncher™ to pass back to the browser. Your plugin is responsible for allocating (via 'new' or 'malloc') the string that this field points to and AdCruncher™ is responsible for 'delete'ing (or 'free'ing) it.
If your plugin returns non-zero but does not provide a pointer to new replacement HTML code in this field (i.e. the field is still NULL upon return), then AdCruncher™ assumes "pszInHTML" points to the replacement HTML code (i.e. it assumes your plugin has modified 'in place' the HTML code pointed at by the "pszInHTML" field).
A return of zero means your plugin is NOT returning any new replacement HTML code and causes AdCruncher™ to completely ignore the contents of the "pszOutHTML" field. If you return zero back to AdCruncher™, it will NOT 'delete' (or 'free') the "pszOutHTML" field; it will ignore it completely. (And furthermore, it will discard the "pszInHTML" copy that it passed to your plugin and return to the browser the original copy of the HTML code as received from the remote server).
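Returning brand new HTML code might look roughly like this. It is a sketch under assumptions: the appended marker comment is invented for the example, and 'malloc' is used on the assumption that it matches whatever AdCruncher™ uses to release the string (check the provided sample to see which allocator it actually pairs with).

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Assumption: on Windows these macros come from <windows.h> and the sample
// project; they are defined empty here so the sketch also builds elsewhere.
#ifdef _WIN32
#include <windows.h>
#else
#define CALLBACK
#define WINAPI
#endif
#ifndef EXPORT
#define EXPORT
#endif

typedef void (CALLBACK* PLOGDEBUGFUNC)(void*, const char*);

typedef struct tagFILTERHTML
{
    size_t        cbSize;      // [in]  size of this structure
    char*         pszURI;      // [in]  ptr to URI
    char*         pszInHTML;   // [in]  ptr to original HTML code
    char*         pszOutHTML;  // [out] ptr to replacement HTML code
    PLOGDEBUGFUNC pLogDebug;   // [in]  ptr to function to log debug messages
    void*         pContext;    // [in]  debug logging context value
} FILTERHTML;

extern "C" int WINAPI EXPORT FilterHTML(FILTERHTML* pParms)
{
    static const char marker[] = "<!-- filtered -->";   // hypothetical

    if (pParms == NULL || pParms->pszInHTML == NULL)
        return 0;                           // do not filter

    size_t inLen     = strlen(pParms->pszInHTML);
    size_t markerLen = sizeof(marker) - 1;

    // The plugin allocates the replacement string; the host releases it.
    char* out = (char*)malloc(inLen + markerLen + 1);
    if (out == NULL)
        return 0;                           // allocation failed: keep original

    memcpy(out, pParms->pszInHTML, inLen);
    memcpy(out + inLen, marker, markerLen + 1);   // copies the terminator too

    pParms->pszOutHTML = out;               // pszInHTML is left untouched
    return 1;                               // "use my replacement HTML"
}
```

Note that the zero return on allocation failure leaves "pszOutHTML" NULL, so there is nothing for either side to free in that path.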
The debug logging function may be used to log whatever debugging information you desire to AdCruncher's debugging logfile. The first parameter MUST be the 'pContext' value that was passed to you, and the second parameter is simply a pointer to the character string you wish to be logged. AdCruncher™ does not do anything with this string other than to log it to the debugging logfile so you can 'malloc' it if you wish. Just be sure to 'free' it whenever the debug logging function returns back to you.
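As a rough sketch of the logging call described above, the helper below builds a message in a 'malloc'ed buffer, logs it, and frees the buffer once the logging function returns. The "LogURI" helper and the message text are hypothetical; only the FILTERHTML layout and the callback signature come from the interface definition.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Assumption: CALLBACK comes from <windows.h> on Windows; it is defined
// empty here so the sketch also builds elsewhere.
#ifdef _WIN32
#include <windows.h>
#else
#define CALLBACK
#endif

typedef void (CALLBACK* PLOGDEBUGFUNC)(void*, const char*);

typedef struct tagFILTERHTML
{
    size_t        cbSize;      // [in]  size of this structure
    char*         pszURI;      // [in]  ptr to URI
    char*         pszInHTML;   // [in]  ptr to original HTML code
    char*         pszOutHTML;  // [out] ptr to replacement HTML code
    PLOGDEBUGFUNC pLogDebug;   // [in]  ptr to function to log debug messages
    void*         pContext;    // [in]  debug logging context value
} FILTERHTML;

// Hypothetical helper: logs which URI the plugin was called for.
static void LogURI(const FILTERHTML* pParms)
{
    static const char prefix[] = "FilterHTML called for ";

    if (pParms == NULL || pParms->pLogDebug == NULL)
        return;                             // no logger supplied

    const char* uri       = pParms->pszURI ? pParms->pszURI : "(no URI)";
    size_t      prefixLen = sizeof(prefix) - 1;
    size_t      uriLen    = strlen(uri);

    char* msg = (char*)malloc(prefixLen + uriLen + 1);
    if (msg == NULL)
        return;

    memcpy(msg, prefix, prefixLen);
    memcpy(msg + prefixLen, uri, uriLen + 1);   // copies the terminator too

    // The first argument must be the pContext value the host passed in.
    pParms->pLogDebug(pParms->pContext, msg);

    free(msg);      // the plugin owns the message, so it frees it afterwards
}
```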
Summary
To recap, there are three possible returns back to AdCruncher™:
1. Zero.
2. Non-zero with "pszOutHTML" NULL.
3. Non-zero with "pszOutHTML" non-NULL (i.e. pointing to something).
For case #1, AdCruncher™ does nothing (other than discard the pszURI and pszInHTML strings it passed to the plugin, which it always does anyway). This causes the browser to use the original HTML code as received from the remote server (i.e. no filtering takes place).
For case #2, AdCruncher™ passes the "pszInHTML" HTML code back to the browser. (It's assumed that the plugin has modified it 'in place' since a non-zero return code indicates to AdCruncher™ to use the "replacement" HTML code and no replacement HTML code was provided since "pszOutHTML" doesn't point to anything).
And for case #3, AdCruncher™ passes the "pszOutHTML" HTML code back to the browser and then 'delete's (or 'free's) the string afterwards.
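Seen from the calling side, the three cases boil down to a small decision. The function below is purely illustrative and is NOT AdCruncher's actual code: "SelectHTMLForBrowser" and its "pszOriginal" parameter are made-up names that restate the rules above in code form.

```cpp
#include <cstddef>

// Assumption: CALLBACK comes from <windows.h> on Windows; it is defined
// empty here so the sketch also builds elsewhere.
#ifdef _WIN32
#include <windows.h>
#else
#define CALLBACK
#endif

typedef void (CALLBACK* PLOGDEBUGFUNC)(void*, const char*);

typedef struct tagFILTERHTML
{
    size_t        cbSize;      // [in]  size of this structure
    char*         pszURI;      // [in]  ptr to URI
    char*         pszInHTML;   // [in]  ptr to original HTML code
    char*         pszOutHTML;  // [out] ptr to replacement HTML code
    PLOGDEBUGFUNC pLogDebug;   // [in]  ptr to function to log debug messages
    void*         pContext;    // [in]  debug logging context value
} FILTERHTML;

// Hypothetical host-side helper: given the plugin's return code and the
// structure it was handed, pick the string that goes to the browser.
// pszOriginal stands for the host's own copy of the server's HTML.
static const char* SelectHTMLForBrowser(int rc,
                                        const FILTERHTML* pParms,
                                        const char* pszOriginal)
{
    if (rc == 0)
        return pszOriginal;          // case #1: no filtering
    if (pParms->pszOutHTML == NULL)
        return pParms->pszInHTML;    // case #2: modified in place
    return pParms->pszOutHTML;       // case #3: new replacement string
}
```

Reading the cases this way also makes the ownership rule easy to remember: only in case #3 is there a plugin-allocated string for the host to release.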
That's it. It's that simple for right now. Future versions of this feature may build upon the foundation and provide new features and abilities, but for now that's all there is.
If you do decide to write an original or handy-dandy filtering plugin that does something fancy, I'd appreciate it if you'd let me know about it. I'd like to know what kind of cool things my customers are doing with my product. :)
"Fish" (David B. Trout)
fishsoftdevlabs.com
"Programming today is a race between
software engineers striving to build bigger
and better idiot-proof programs, and the
Universe trying to produce bigger and better
idiots. So far, the Universe is winning."
- Rich Cook