OneNote 2007 – The HTML Importer




Most recently I’ve worked on an HTML importer for OneNote. Originally, I planned to build a Firefox extension that would allow right-clicking on a web page and selecting a “Send to OneNote” option. After some initial research I decided that was going to be more effort than I could spare right now. Instead I compromised and built a tool that works with both Firefox and IE; the downside is that it is a two-step process.

The code I’ve written will be easily adaptable to a Firefox extension if I ever get around to that part of the project. I believe the key will be to take the “save complete page” code from chrome://browser/content/contentAreaUtils.js in the core of Firefox and adapt it to the needs of the extension.

Note: In order to use this application you will need to copy JPHOneNoteUtilities.dll into C:\Windows\assembly, or ensure the file is in the same directory as OneNoteHTMLImporter.

Let’s look at how OneNoteHTMLImporter.cs works.

OneNote has limited support for HTML pages, but it seems to understand some formatting directives from style sheets. (Does anyone know what the HTML specifications are for OneNote?) My objective was to get all of the content from a web page onto the local computer. This keeps you from depending on external web sites to render the page correctly. It also gives you a local copy of the information in case the page ever goes away.


In order to do this, you need to use the “Save Page As; Web Page, complete” option in your browser. On the top browser menu, click ‘File’ and select the “Save Page As” option.

This will bring up a dialog box, which prompts you for where to save the web page. Once you’ve selected the location for the files, you need to make sure the “Save as type:” is set to “Web Page, complete”. Then just click the save button.


Now you can double-click the OneNote HTML Importer application. It will present you with a dialog to pick the file you just saved from the browser. Note: You can include a directory path in the shortcut, and this will be the default place it looks for files.

In the dialog box you will see the filename that you saved, and also a directory. Select the .htm or .html file that you saved. You can ignore the directory: it contains all of the additional files (images, style sheets, etc.) that were found in the web page, and the HTML importer handles them automatically.
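To make the pairing between the saved file and its directory concrete, here is a minimal Python sketch of how the companion directory might be located. It is an illustration only, not the importer's actual C# code. The function name `find_companion_dir`, the default suffix `files`, and the assumption that the browser joins the page name and the suffix with `_` or `-` are all hypothetical.

```python
from pathlib import Path
from typing import Optional


def find_companion_dir(html_file: Path, suffix: str = "files") -> Optional[Path]:
    """Return the directory the browser saved next to html_file, or None.

    Assumption: the directory is named <page name><separator><suffix>,
    where the separator is "_" or "-". Localized browsers may use a
    different suffix, so it is passed in as a parameter.
    """
    stem = html_file.stem  # "page" for "page.htm" or "page.html"
    for separator in ("_", "-"):
        candidate = html_file.with_name(f"{stem}{separator}{suffix}")
        if candidate.is_dir():
            return candidate
    return None
```

For a page saved as `page.htm`, calling `find_companion_dir(Path("page.htm"))` would look for `page_files` and then `page-files` in the same folder, and return `None` if neither exists.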


Click the image to the left to see a larger version of an imported web page. You can view the actual page here.

When the HTML Importer runs it will create a directory in OneNote’s Default Notebook Location called HTML File Storage. If you aren’t sure where this is in OneNote you can go to Tools -> Options -> Save. In this location a unique sub-directory (the name comes from a call to System.Guid.NewGuid()) is created for those files. The files are then moved from their saved location to this one. The main html file is parsed and modified so all of the links point to the files in their new location. It also parses the HTML page for the <title> tag, and uses that as the title for the page. The HTML page is then inserted and embedded into a OneNote page.
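The steps above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the code in OneNoteHTMLImporter.cs: the function names are hypothetical, the links are assumed to be plain double-quoted relative paths that start with the companion directory's name, and the rewritten links are assumed to be absolute `file://` URIs. The final step of inserting the page into OneNote is left out.

```python
import re
import shutil
import uuid
from pathlib import Path
from typing import Tuple

TITLE_RE = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def extract_title(html: str, default: str = "Untitled") -> str:
    """Return the text of the first <title> element, with whitespace collapsed."""
    match = TITLE_RE.search(html)
    if not match:
        return default
    return " ".join(match.group(1).split()) or default


def relocate_saved_page(html_file: Path, files_dir: Path,
                        notebook_root: Path) -> Tuple[Path, str]:
    """Move a saved page into a unique storage folder and repoint its links.

    Returns the new location of the HTML file and the page title.
    """
    # A fresh, GUID-named sub-directory keeps each import separate.
    target = (notebook_root / "HTML File Storage" / str(uuid.uuid4())).resolve()
    target.mkdir(parents=True)

    # Move the companion directory (images, style sheets, ...) as a whole.
    new_files_dir = target / files_dir.name
    shutil.move(str(files_dir), str(new_files_dir))

    # Assumption: links look like src="page_files/logo.png"; real pages may
    # URL-encode the directory name or use single quotes.
    html = html_file.read_text(encoding="utf-8", errors="replace")
    html = html.replace(f'"{files_dir.name}/', f'"{new_files_dir.as_uri()}/')

    new_html = target / html_file.name
    new_html.write_text(html, encoding="utf-8")
    html_file.unlink()
    return new_html, extract_title(html)
```

A plain string replacement is used here only to keep the sketch short; a real importer would more likely walk the parsed HTML so that it only touches attribute values.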

The one downside to maintaining the files outside of OneNote is that they are not tied to the OneNote page that is created. So if you delete the page from OneNote, the directory with the extra files will stick around.

OneNote takes all of the external information it can use and embeds that into the page. The only reason to maintain an external copy of the information is so that you can render the page in a web browser.

Maybe the thing to do would be to add a checkbox to the open dialog box that would allow you to pick whether or not you want the page to be accessible outside of OneNote.

This software is distributed on an “AS IS” basis, without warranties or conditions of any kind, either express or implied.

 



17 responses to “OneNote 2007 – The HTML Importer”

  1. AdminID says:

    “Clip to OneNote” is a FF Extension that will Send To. It’s available on http://www.OneNotePowerToys.com

  2. Mitchke says:

    Hi there! Just what I was looking for… The first steps go all right, but then I get an error message:
    Error I_._._
    Directory move failed: D:\tramdsm\bureaublad\onenote-2007-the-importer-files
    -> D:\tramdsm\tekst\OneNote Notebooks\HTML File Storage\eal bdf28-4baO-48ab-b729-38fc4de54edS\onenote-2007-the-importer_files
    when I click ok the page text gets inserted -but without the pics of course. Any idea?
    the onenote importer doesn’t run at all, so maybe it’s just something I did wrong with the placement of the dll? (couldn’t get it to be placed in \assembly)
    thanks for your work!

  3. Jamie says:

    I have uploaded new versions of the program to see if I can get some more information about the error you are running into.
    Re-download the following files:
    http://stratusnine.com/cgi-bin/download.cgi/JPHOneNoteUtilities.dll
    http://stratusnine.com/cgi-bin/download.cgi/OneNoteHTMLImporter.exe
    Try the HTML Importer again, and send me any error output you get.
    As for the OneNoteImporter, it is a no frills program and does not produce any screen output. It is important that you set the directory for it to look for files in either by modifying the shortcut that is calling it, or by supplying the directory on the command line:
    onenoteimporter c:\place\to\find\dirs
    Another important thing to note is that the location you specify (c:\place\to\find\dirs in the example above) is expected to contain only sub-directories. Files at the top level are ignored. However, each sub-directory is processed individually and should become an Unfiled OneNote page.
    I know these programs are rough around the edges — no-one ever showed interest before, so I never took the time to improve them …
    –Jamie

  4. Jason says:

    Hello, I just wanted to send my thanks for the HTML Importer for OneNote. It works great for all of the web pages I have tried so far. The appearance of pages imported to onenote is much better than using the “clip to onenote” ff extension and retain their links and options that are not available if you simply print a webpage to onenote. Thanks!!

  5. niki kircher says:

    Thank you for developing this. I have been looking for something like this. I am having problems getting it to work but I suspect that it is because I am running the older version of onenote — will this work with that version or do I need to upgrade.

  6. Terry Allan Bennett says:

    I have just run across this item and am very intrigued by it. The problem is that when I run the installer it crashes. No error message or anything, just that the program crashes. I am running Vista Ultimate 64 bit and OneNote 2007 and have complete administrator privileges. I’d love to work with this add-in, so if you have any suggestions they would be greatly appreciated.

  7. Jamie says:

    @Terry Allan Bennett
    I’ve seen that happen based on a couple of different conditions.
    1. The JPHOneNoteUtilities.dll is not located in the same directory as OneNoteHTMLImporter.exe.
    2. The program is being run from a network drive or some place other than the “My Computer” security zone.
    I hope this helps.

  8. Jamie says:

    @niki kircher
    These utilities only work with OneNote 2007 RTM. The API was not available for OneNote 2003. Also, the API, or more specifically the XML namespace, changed between 2007 beta and 2007 RTM, so these utilities do not work with the 2007 beta either.

  9. Robert says:

    I downloaded the newest version and tried to import a website saved with Opera’s “save as”, which results in an empty notebook, and one saved from FF, which results in this:
    http://62.69.200.53/blad_importer.png
    can you help me?

  10. Rainer says:

    It looks like you have written exactly the little piece of software that I need. Unfortunately I am using a German version of Vista, which results in a subdirectory called ‘xy-Dateien’ instead of ‘xy-Files’, and so it seems that the application does not find the path. Is it possible to pass the language along with a command line parameter that would switch to the correct extension of the subdirectory?
    Thanks in advance!

  11. Jamie says:

    @Rainer: I have added a command line parameter to allow for specifying the browser directory extension. This was much simpler than adding support for localization. To use it, just add “-e” followed by the extension to the command line. Specifically for your case it will be “-e Dateien”.
    You can download the new exe from here.

  12. Robb says:

    I am using vista 64 and testing the beta 2010, will it work with these?
    Thank you for your work!

  13. Jamie says:

    When I wrote these for 2007 the XML schema changed between the beta and RTM. As such, I do not expect these to work with 2010 out of the box. I haven’t been following the beta this time around, but I do plan to upgrade when the product is RTM. I’ll probably look at making these work with 2010 at that point.

  14. Chris Cogan says:

    Some notes:
    Windows 7 won’t allow me to copy JPHOneNoteUtilities.dll into the assembly subfolder of the windows folder (the Windows Explorer right-click menu doesn’t even show a Paste option).
    And, when I try to run OneNoteHTMLImporter.exe, I get a message saying that it has stopped working and that Windows is “checking for a solution to the problem”, after which there is a pause, and then the message disappears.
    I understand that your stuff may have been written for Win XP (etc.), but I thought you’d like to know of these problems in case you decide to update the importer.
    Do you know if the API will be less horrifying in OneNote 2010 than it is in 2007? (The documentation for the OneNote 2007 API killed almost all thoughts I had of trying to do anything useful with OneNote via the API.)

  15. krusader23 says:

    Hi,
    I’ve downloaded the plugin and I get the same error as in the first response.
    I’ve attached a small *.zip file with the log and all the files that Importer.exe created on my hdd.
    If you’ve got some time, please take a look at them.
    http://www.4shared.com/file/5WAaKjpu/OneNote_importer_plugin-import.html
    Regards.

  16. John says:

    Hi,
    This is a great utility, thank you for taking the time to develop it.
    When I try to install it, I get the standard MS grey box saying “OneNoteHTMLImporter has encountered a problem and needs to close… etc, etc”
    I installed dotnet 1.1 2+ and 3.5 hoping that it was a dotnet related problem. The DLL is in the same directory as OneNoteHTMLImporter.
    Any suggestions as to what may be causing the problem ? (running XP Pro SP2)
    Thank you,
    John.

  17. John says:

    Additional info for previous message (also by me, of course)
    I looked into the “dump” that is available from the MS “grey box” and found that the problem is caused by a system.io.filenotfoundexception. I tried to figure out what file it was looking for but couldn’t.
    Any ideas as to what the missing file may be? I’ve copied the DLL into c:\windows\assembly as well as having it present in the same directory as OneNoteHTMLImporter.exe. Could it be looking for some file that is part of Office or OneNote?
    thanks,
    John.