OneNote 2007 – The HTML Importer
Most recently I’ve worked on an HTML importer for OneNote. Originally, I planned to build a Firefox extension that would let you right-click a web page and select a “Send to OneNote” option. After some initial research, I decided that was going to take more effort than I could spare right now. Instead, I compromised and built a tool that works with both Firefox and IE; the downside is that it is a two-step process.
The code I’ve written should be easy to adapt to a Firefox extension if I ever get around to that part of the project. I believe the key will be to take the “save complete page” code from chrome://browser/content/contentAreaUtils.js in the core of Firefox and adapt it to the needs of the extension.
Note: In order to use this application you will need to copy JPHOneNoteUtilities.dll into C:\Windows\assembly, or ensure the file is in the same directory as OneNoteHTMLImporter.
Let’s look at how OneNoteHTMLImporter.cs works.
OneNote has limited support for HTML pages, but it seems to understand some formatting directives from style sheets. (Does anyone know what the HTML specifications are for OneNote?) My objective was to get all of the content from a web page onto the local computer. This keeps you from being tied to external web sites in order to render the page correctly. It also gives you a local copy of the information in case the page ever goes away.
In order to do this, you need to use the “Save Page As; Web Page, complete” option in your browser. On the browser’s top menu, click ‘File’ and select the “Save Page As” option (Internet Explorer labels it “Save As”).
This will bring up a dialog box, which prompts you for where to save the web page. Once you’ve selected the location for the files, you need to make sure the “Save as type:” is set to “Web Page, complete”. Then just click the save button.
Now you can double-click the OneNote HTML Importer application. It will present you with a dialog to pick the file you just saved from the browser. Note: You can include a directory path in the shortcut, and this will be the default place it looks for files.
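The default-directory behavior could be handled roughly like this. It is a minimal sketch, not the importer’s actual code: StartupOptions and ResolveInitialDirectory are hypothetical names, and it assumes the shortcut passes the directory as the first command-line argument and that the result is later assigned to the file dialog’s InitialDirectory property.

```csharp
using System;
using System.IO;

static class StartupOptions
{
    // Hypothetical helper: picks the folder the file dialog should open in.
    // Assumption: the shortcut passes the directory as the first argument.
    public static string ResolveInitialDirectory(string[] args)
    {
        if (args.Length > 0 && Directory.Exists(args[0]))
        {
            return Path.GetFullPath(args[0]);
        }
        // No usable path was given, so fall back to the user's Documents folder.
        return Environment.GetFolderPath(Environment.SpecialFolder.MyDocuments);
    }
}
```

Falling back to a known folder when the argument is missing or wrong means a stale shortcut still opens a usable dialog.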
In the dialog box you will see the filename that you saved, and also a directory. You want to select the .htm or .html file that you saved. You can ignore the directory. It contains all of the additional files (images, style sheets, etc.) that were found in the web page, and the HTML importer handles them automatically.
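Here is one way the importer could locate that directory from the selected file. This is a sketch under an assumption: the browser named the folder after the page with a “_files” suffix (for example, page.htm next to page_files), which is the usual default but can differ in localized browsers. SavedPage and FindCompanionDirectory are hypothetical names.

```csharp
using System;
using System.IO;

static class SavedPage
{
    // Hypothetical helper: returns the folder of supporting files saved
    // next to the page, or null if no such folder exists.
    // Assumption: the browser named it "<page name>_files".
    public static string FindCompanionDirectory(string htmlPath)
    {
        string folder = Path.GetDirectoryName(Path.GetFullPath(htmlPath));
        string baseName = Path.GetFileNameWithoutExtension(htmlPath);
        string candidate = Path.Combine(folder, baseName + "_files");
        return Directory.Exists(candidate) ? candidate : null;
    }
}
```

A page with no images or style sheets may have no companion folder at all, which is why the helper returns null instead of throwing.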
Click the image to the left to see a larger version of an imported web page. You can view the actual page here.
When the HTML Importer runs, it creates a directory called HTML File Storage in OneNote’s Default Notebook Location. If you aren’t sure where this is, go to Tools -> Options -> Save in OneNote. Inside that directory, a unique sub-directory (the name comes from a call to System.Guid.NewGuid()) is created for the saved page’s files. The files are then moved from their saved location to this one. The main HTML file is parsed and modified so all of the links point to the files in their new location. The importer also parses the HTML page for the <title> tag and uses that as the title for the page. The HTML page is then inserted and embedded into a OneNote page.
The one downside to maintaining the files outside of OneNote is that they are not tied to the OneNote page that is created. If you delete the page from OneNote, the directory with the extra files will stick around.
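The file-handling steps above can be sketched as follows. This is an illustration under stated assumptions, not the code in OneNoteHTMLImporter.cs: PageRelocator and its methods are hypothetical names, the links are assumed to be rewritten as absolute file:// URIs, the saved file is assumed to be UTF-8, and regular expressions stand in for whatever parsing the importer really does. The OneNote insertion step is left out.

```csharp
using System;
using System.IO;
using System.Text.RegularExpressions;

static class PageRelocator
{
    // Returns the text of the first <title> element, or null if there is none.
    public static string ExtractTitle(string html)
    {
        Match m = Regex.Match(html, @"<title[^>]*>(.*?)</title>",
                              RegexOptions.IgnoreCase | RegexOptions.Singleline);
        return m.Success ? m.Groups[1].Value.Trim() : null;
    }

    // Points src/href values that start with "<oldFolderName>/" at newFolderPath.
    // Assumption: the new links are absolute file:// URIs.
    public static string RewriteLinks(string html, string oldFolderName, string newFolderPath)
    {
        string newPrefix = new Uri(Path.GetFullPath(newFolderPath)).AbsoluteUri.TrimEnd('/') + "/";
        string pattern = @"(?<=(?:src|href)\s*=\s*[""'])(?:\./)?" + Regex.Escape(oldFolderName) + "/";
        // A MatchEvaluator is used so "$" in the path is never read as a substitution.
        return Regex.Replace(html, pattern, m => newPrefix, RegexOptions.IgnoreCase);
    }

    // Moves the page and its companion folder into <storageRoot>/<new GUID>/,
    // rewrites the links, and returns the new path of the HTML file.
    public static string Relocate(string htmlPath, string companionDir, string storageRoot)
    {
        string target = Path.Combine(storageRoot, Guid.NewGuid().ToString());
        Directory.CreateDirectory(target);

        string newHtmlPath = Path.Combine(target, Path.GetFileName(htmlPath));
        File.Move(htmlPath, newHtmlPath);

        if (companionDir != null)
        {
            string folderName = Path.GetFileName(companionDir);
            string newCompanionDir = Path.Combine(target, folderName);
            Directory.Move(companionDir, newCompanionDir);

            // Assumption: the saved file can be read and written as UTF-8.
            string html = File.ReadAllText(newHtmlPath);
            File.WriteAllText(newHtmlPath, RewriteLinks(html, folderName, newCompanionDir));
        }
        return newHtmlPath;
    }
}
```

Here storageRoot would be the HTML File Storage directory. Directory.Move can fail when the source and destination are on different volumes, so a fuller version would fall back to copy-and-delete. The sketch also does not handle folder names that the browser URL-encoded inside the page.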
OneNote takes all of the external information it can use and embeds that into the page. The only reason to maintain an external copy of the information is so that you can render the page in a web browser.
Maybe the thing to do would be to add a checkbox to the Open dialog box that would let you pick whether or not you want the page to be accessible outside of OneNote.
This software is distributed on an “AS IS” basis, without warranties or conditions of any kind, either express or implied.