Filters for KOffice

1. Do we really need filters?
2. Which filters are there?
3. Which filters are most wanted?
4. How to use a filter?
5. Developing Filters for KOffice
5.1. Prepare the Environment
5.2. Behind the Scenes
5.3. How to Develop a Filter?
5.4. Remaining Questions?
5.5. File Formats - Doctype Definitions
5.6. Special Filters
5.7. Add Documentation
5.8. Links for Developers



1. Do we really need filters?

In my opinion we definitely need filters, because an important factor in the success of an office suite is its ability to import and export documents. Of course this is not as critical as printing or a nice and straightforward user interface, but it's not just a "nice-to-have" feature, either.

Just imagine a user working in a heterogeneous environment, using KOffice among other office suites. As we all know, it is sometimes necessary to exchange documents. Now the adventure begins:

  • Which format do I use (i.e. Which format is supported by both office suites)?
  • How much information is lost due to internal differences between the office suites (e.g. formatting, tables, pictures, columns,...)?
  • What about the character sets (i.e. the encoding of umlauts and so on)?
  • Can I use Unicode characters in the other office suite?
Another problem is that some vendors of proprietary office suites provide inaccurate and/or incomplete documentation of their file formats (or no information at all). This is one of the obstacles we face, because searching for information in a binary file is really time consuming, as you can imagine. (At this point I'd like to thank Espen Sand for his brilliant KHexEditor; it should be available if you have installed KDE 2 with kdeutils.)



2. Which filters are there?

Well, at the moment all parts of KOffice support the filter architecture (read: the parts are able to use filters), but there are working filters only for some parts. These include:

KWord
type              info                    import  export
ASCII             i.e. plain text         yes     yes
HTML                                      yes     yes
MS WinWord 97     97 + 2000               yes     -
DocBook                                   -       yes
Applix Word                               beta    -
LaTeX                                     -       beta
RTF                                       bad     -
MIF                                       bad     -

KSpread
type              info                    import  export
CSV               Comma Separated Values  yes     yes
MS Excel 97       97 + 2000               yes     -

KPresenter
type              info                    import  export
MS PowerPoint 97                          bad     -

KIllustrator
type              info                    import  export
MSOD              MS Office Drawing       beta    -
WMF               Windows MetaFile        beta    -

Krayon
type              info                    import  export
-                                         -       -

Kivio
type              info                    import  export
-                                         -       -

yes = It's a well-working filter
beta = It's a first beta version of a filter - does something
bad = It's a first alpha version of a filter - does nothing good

If you are looking for information about the current status of a particular filter, just click the yes/beta/bad text in the import/export column.

I can think of many filters we'd like to have, but sadly we are only a few developers.
Do you want to join us?
And would you like to help develop one of the filters listed in the next section, or another one?



3. Which filters are most wanted?

Which filters would help KOffice gain a better standing?

At the moment I think the "most wanted" filter would be an RTF import and export filter for KWord, as you can exchange RTF documents with nearly all office suites. Obviously there is more need for KWord and KSpread filters than for KPresenter filters.

Here is a first list:
  • KWord:
    • Rich Text Format
    • MIF (Adobe)
    • StarWriter (SUN)
    • WordPerfect (Corel)
    • Abiword (Gnome)
    • Word'95 (MS)
    • .....
  • KSpread:
    • StarCalc (SUN)
    • Gnumeric (Gnome)
    • ApplixSpread (ApplixWare)
    • xspread
    • Excel'95 (MS)
    • Lotus 123
    • Quattro Pro (Corel)
    • .....
  • KPresenter:
    • PowerPoint 97 (MS)
    • StarImpress (SUN)
    • Presentations (Corel)
    • PowerPoint'95 (MS)
    • ApplixGraphics (ApplixWare)
    • .....
  • KIllustrator:
    • StarDraw (SUN)
    • xfig
    • .....
  • Krayon:
    • .....
  • Kivio:
    • Dia (Gnome)
    • Visio (MS)
    • .....

MIF Info:
MIF is a tag-based, ASCII-only file format that predates XML and is used only by Adobe FrameMaker. Since FrameMaker documents can contain pictures, MIF can also embed pictures in the file. The format is still in use, and FrameMaker 6 introduces a new version of it that (according to Adobe) is not backward compatible (FrameMaker 5.5 cannot load FrameMaker 6.0 MIF files).

It would be VERY good to have as an import/export filter for KWord.




4. How to use a filter?

The KOffice library developers have done a magnificent job, and you will not even notice when you use a filter to convert a file to the part's native format. OK, you can see it (debug output), but there is no difference for you at all. Just select
File -> Open... for import or
File -> Save or File -> Save As... for export
and choose the extension which should be used. Select (or name) the file and off you go :)





5. Developing Filters for KOffice


5.1. Prepare the Environment

As KOffice needs KDE 2 it's necessary to install at least parts of KDE 2 (Qt, kdesupport, kdelibs, kdebase - exactly in this order) and - of course - KOffice. I recommend looking for further information on how to install it. To get some help from real KOffice experts please join the KOffice mailing list (koffice@kde.org or koffice-devel@kde.org). There is an archive of those lists and you can find them all at http://lists.kde.org.
One final hint: Add -debug to your ./configure options for Qt and --enable-debug to your ./configure options for all KDE packages. The resulting binaries are quite large and a little bit slower, but nonetheless this is an enormous help if you are developing (=debugging :) something.
Oh, and use gdb-5.0 or later, because gdb-4.x always crashed on KOffice stuff (at least for me).





5.2. Behind the Scenes

There are several ways to program a filter, depending on your needs. However, unless you really need a non-standard filtering method (e.g. because you'd like to import huge amounts of data and the performance is bad), we recommend using the plain and easy standard method. All the following descriptions assume that you use the standard filtering method. The gory details about the optimized (read: hacky) methods of filtering are provided further down this page, in section 5.6 (Note: I didn't add a link, because you should at least scroll past the standard description :).

KOffice uses a quite straightforward approach to convert files to the native format of the matching KOffice part. I'll try to explain this via a simple example (from the user's point of view) before providing further (more detailed) information:

  1. The user activates File -> Open...
  2. The file dialog pops up
  3. She/He selects a file extension (e.g. *.csv [CSV = Comma Separated Values])
  4. Now the file dialog shows only the matching files
  5. Depending on the filter, the user might see a configuration dialog. This (optional) dialog is "embedded" in the file dialog. It can be used to pass configuration options to the filter (i.e. a kind of replacement for commandline arguments)
  6. After selecting a file she/he presses Ok
  7. The filter converts the file to KSpread's native format (taking the configuration options into account)
  8. KSpread opens a (native!) file
  9. The user is happy :)

When saving documents the filtering works nearly identically. The whole process of looking for available filters and so on is done by the KOffice libraries, so you don't have to care about that. As you can see there is no magic involved, and now that we know the basics let's have a look at how this really works!

After the user has clicked File -> Open... the part queries the KoFilterManager (koffice/lib/kofficecore/koFilterManager.cc) to prepare the file dialog.
The filter manager (=KoFilterManager) queries the trader for information on the supported filter types (e.g. HTML, TXT,...) and configuration dialogs. Then it prepares the KFD (K File Dialog) to show all this stuff. The file dialog pops up and now the user can select a specific filter (e.g. '*.csv'). After selecting a file (with the correct extension - the file dialog ensures that) the user might have to set some configuration options (Note: You don't have to provide such a dialog, it's just a nifty feature for some obscure filters which need passwords and so on, i.e. some kind of user input!), clicks Ok, and the filename is returned to the part. The part passes the filename to the filter manager. The filter manager checks whether the file is native or not. If it is native, the filter manager returns the name and the part opens it. If it is not, the filter manager returns another filename - this is the name of the converted file (a temporary file somewhere in /tmp). Finally the part opens the converted (=native) file.

But when does the real converting occur? Let's see... the filter manager gets the filename (and the mimetype of the calling part). It tries to find out the mimetype of the file which should be opened. Then the trader is queried if there is a filter which is able to handle those two mimetypes (the one of the file to convert and the native format of the application, e.g. HTML -> KWord). If such a filter is found the filter manager loads it and passes the information of the optional configuration dialog.

Filters are shared libs which are opened on demand (via KLibLoader, which is a wrapper for dlopen) and closed after a few minutes of inactivity (so that we don't waste too much memory). All the filters have to inherit KoFilter (koffice/lib/kofficecore/koFilter.h) and they have to override the pure virtual method filter(...) . This method is called by the filter manager (for details see below) and the filter starts to convert the file (i.e. opens the file, reads it, converts the contents, writes it).
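
To make the lookup described above a bit more concrete, here is a minimal sketch in plain standard C++. It is not the KoFilterManager code: FilterRegistrySketch, resolveForOpen, the mimetype strings, the library name, and the temporary file name are all hypothetical stand-ins, and a std::map replaces the trader query and the KLibLoader call.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for the trader's knowledge about filters; the real
// lookup goes through the KDE trader and KLibLoader, not a std::map.
class FilterRegistrySketch {
public:
    // Register the library that converts mimetype `from` into mimetype `to`.
    void add(const std::string &from, const std::string &to,
             const std::string &library) {
        assert(!library.empty());
        filters_[std::make_pair(from, to)] = library;
    }

    // Return the library name for the mimetype pair, or "" if none is known.
    std::string find(const std::string &from, const std::string &to) const {
        auto it = filters_.find(std::make_pair(from, to));
        return it == filters_.end() ? std::string() : it->second;
    }

private:
    std::map<std::pair<std::string, std::string>, std::string> filters_;
};

// Mirrors the decision described above: a native file is returned as-is,
// anything else is replaced by the name of a converted temporary file.
// An empty result means that no matching filter was found.
std::string resolveForOpen(const FilterRegistrySketch &registry,
                           const std::string &fileName,
                           const std::string &fileMime,
                           const std::string &nativeMime) {
    if (fileMime == nativeMime)
        return fileName;                    // already native, open directly
    if (registry.find(fileMime, nativeMime).empty())
        return std::string();               // nobody can convert this pair
    // A real implementation would load the library, run the filter, and
    // generate a unique name; this fixed name is only a placeholder.
    return "/tmp/sketch-converted-file";
}
```

The point of the sketch is only the shape of the decision: the part never sees the foreign file, it just receives a filename that is guaranteed to be in its native format.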





5.3. How to Develop a Filter?

Please have a look at koffice/filters/kword/ascii/asciiimport* if you want to write an import filter, or at koffice/filters/kword/ascii/asciiexport* if you want to write an export filter (and replace "import" with "export" below).

To create a filter you have to:
  1. Copy the appropriate files from the koffice/filters/kword/ascii directory to a separate directory (at least Makefile.am, asciiimport*, and kword_ascii_import.desktop).
  2. Create a "factory" class for your filter. This factory is needed to load the filter (as we've heard, technically the filter is a shared library which is "dlopened" on demand). You only have to change all the names in 'asciiimport_factory.cc' and 'asciiimport_factory.h' (i.e. all the ASCII, Ascii, and ascii stuff has to be replaced by YOURFILTER , YourFilter, and yourfilter).
  3. Derive your "main-filter-class" from KoFilter. Look at 'asciiimport.cc' and 'asciiimport.h' how this is done. Don't forget that you have to include or link the moc file - otherwise the filter won't work! (the classname is needed during "incarnation," and all this information is stored in those nifty little moc files).
  4. Rename the files (ascii* -> yourfilter*) if you haven't done this, yet.
  5. Add your classes (i.e. the files containing the C++ headers/sources) which are needed to convert the file (for an example have a look at the export filter in koffice/filters/kword/html/). Remember: The filter is a "normal" library, so you may do whatever you can do with libraries :)
  6. Adjust the *.desktop file (Note: You'll have to rename the file or it won't work correctly because you'll overwrite something!). In the file you'll have to change the "Name", "Comment", "Export", "ExportDescription", "Import", "ImportDescription", "X-KDE-Library", and probably the "Icon" and "MiniIcon" fields.
  7. If KDE doesn't "know" your filetype up to now, please add an x-*.desktop file (see koffice/filters/kspread/csv). Make sure that it gets installed to the correct directory. If KDE knows your filetype, you don't have to care about that.
  8. Adapt the local Makefile.am by renaming/adding files to meet your needs.
  9. If your directory is located somewhere in the koffice/filters directory:
    1. Add your directory to the Makefile.am file of the parent directory.
    2. Execute make -f Makefile.cvs in the top directory (koffice/) (or use create_makefile(s) from kdesdk/scripts).
    3. Run ./configure for the whole stuff again. If it compiles you've won :)
    4. Install (make install).
    5. Test it...
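
As a rough illustration of step 3, here is what the shape of a "main-filter-class" might look like, written against the standard library only so that it compiles without KDE. FilterBaseSketch is a hypothetical stand-in for KoFilter, its filter() signature is an assumption rather than the real one from koFilter.h, and the <doc>/<p> output markup and the "application/x-sketch" mimetype are invented for the example (they are not KWord's native format).

```cpp
#include <cassert>
#include <fstream>
#include <istream>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for KoFilter: one pure virtual conversion method.
class FilterBaseSketch {
public:
    virtual ~FilterBaseSketch() {}
    virtual bool filter(const std::string &fileIn, const std::string &fileOut,
                        const std::string &from, const std::string &to) = 0;
};

class PlainTextImportSketch : public FilterBaseSketch {
public:
    // Core conversion, kept separate so it can be exercised without files:
    // every input line becomes one escaped paragraph element.
    static void convert(std::istream &in, std::ostream &out) {
        std::string line;
        out << "<doc>\n";
        while (std::getline(in, line))
            out << "  <p>" << escape(line) << "</p>\n";
        out << "</doc>\n";
    }

    // Opens the input file, reads it, converts the contents, writes the result.
    bool filter(const std::string &fileIn, const std::string &fileOut,
                const std::string &from, const std::string &to) override {
        if (from != "text/plain" || to != "application/x-sketch")
            return false;                   // not a conversion we handle
        std::ifstream in(fileIn.c_str());
        if (!in)
            return false;
        std::ofstream out(fileOut.c_str());
        if (!out)
            return false;
        convert(in, out);
        return static_cast<bool>(out);
    }

private:
    // Escape the characters that would break the XML-like output.
    static std::string escape(const std::string &text) {
        std::string result;
        for (char c : text) {
            switch (c) {
            case '&': result += "&amp;"; break;
            case '<': result += "&lt;";  break;
            case '>': result += "&gt;";  break;
            default:  result += c;
            }
        }
        return result;
    }
};
```

Keeping the conversion in a separate function that works on streams is a design choice of this sketch; it makes the filter easy to test before it is wired into a factory and a .desktop file.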

To create an optional configuration dialog you have to:

  1. Create a "factory" class for the dialog (like for the filter above). If you look for an example/template: koffice/filters/kspread/csv/csvfilterdia* is quite straightforward.
  2. Design and implement the real dialog. It has to inherit KoFilterDialog. Of course you have to override the pure virtual method state(). This method returns a QString which contains the configuration information. You might want to use some Qt geometry management magic to make the dialog look nice.
  3. The format of this string is up to you, but I suggest using a simple XML-like format (using QDom/QXML) because it's easy to debug.
  4. Create (or modify) the .desktop file (template from the CSV filter)
  5. Extend the Makefile.am file (see koffice/filters/kspread/csv/Makefile.am for details), then re-run make -f Makefile.cvs and ./configure.
  6. Make sure that your dialog neither has any Ok and/or Cancel buttons, nor menubars/toolbars. Shortly put: Make it really simple.
  7. Note: A KOffice part may be launched from the commandline passing a filename. This file gets opened regardless of whether it is a native one or not. Of course no file dialog gets opened in that case, and your dialog is not shown. You can detect this situation by testing the QString in the filter() method against QString::null. If you still want to have the dialog, you have to show it "manually." To give you some ideas: First you have to load the dialog's lib (via KLibLoader) and use the factory class to create a dialog (maybe you'll have to adapt the factory class to be able to do that). Then you'll have to create a real dialog (with an Ok button, at least) and embed your configuration dialog (i.e. the real dialog is the parent of your dialog widget). Then wait for the user to click Ok and use the dialog's state() method to read the information.
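
The round trip of the configuration string (steps 2, 3 and 7) could look roughly like the following sketch. Everything here is an assumption made for illustration: the names CsvOptionsSketch, encodeOptions and decodeOptions, the one-attribute string format, and the use of an empty std::string in place of QString::null. A real filter would more likely parse the string with QDom, as suggested above, instead of searching it by hand.

```cpp
#include <cassert>
#include <string>

// Options a CSV-like filter might need (hypothetical).
struct CsvOptionsSketch {
    char delimiter;
    bool fromDialog;   // false when the filter ran without the dialog
};

// What a dialog's state() method might return (hypothetical format).
std::string encodeOptions(char delimiter) {
    return std::string("<config delimiter=\"") + delimiter + "\"/>";
}

// Parse the string handed to the filter. An empty string stands in for
// QString::null, i.e. the part was launched from the commandline and no
// configuration dialog was shown, so the defaults are used.
CsvOptionsSketch decodeOptions(const std::string &config) {
    CsvOptionsSketch options = { ',', false };
    if (config.empty())
        return options;
    const std::string key = "delimiter=\"";
    const std::string::size_type pos = config.find(key);
    // Naive lookup: take the single character following the attribute name.
    if (pos != std::string::npos && pos + key.size() < config.size()) {
        options.delimiter = config[pos + key.size()];
        options.fromDialog = true;
    }
    return options;
}
```

Falling back to sensible defaults when the string is empty keeps the filter usable from the commandline without forcing you to build the "manual" dialog described in step 7.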

Wow - you wrote a filter! But what now? You don't have a CVS account and you want to check it in? No problem. Please send all the files to me, Werner Trobin, <trobin@kde.org>. Note: It's no problem whatsoever if you send your "drafts" - that's what CVS is good for!





5.4. Remaining Questions?

Feel free to ask me if there are any remaining questions. BTW: It's generally a good idea to ask on koffice@kde.org whether anyone works on a filter before starting to implement it :)





5.5. File Formats - Doctype Definitions

This section contains some useful documentation (I'll add more stuff here, soon):

  • KWord Doctype Definition
    Download: WebCVS (current) or look in your local koffice copy in koffice/kword/dtd/kword.dtd

  • KSpread Doctype Definition
    Download: WebCVS (current) or look in your local koffice copy in koffice/kspread/dtd/kspread.dtd

  • KIllustrator Doctype Definition
    Download: WebCVS (current) or look in your local koffice copy in koffice/killustrator/kil.dtd

  • KPresenter Doctype Definition
    Download: WebCVS (current) or look in your local koffice copy in koffice/kpresenter/dtd/kpresenter.dtd

  • Krayon Doctype Definition
    Download: WebCVS (current) or look in your local koffice copy in koffice/kimageshop/dtd/krayon.dtd

Here is a little advice (from IBM) on how to read doctype descriptions. Once you have looked at that page, reading them shouldn't be a problem:
   Doctype Description



5.6. Special Filters

As I already said above, sometimes the standard filter interface is just a little bit too slow, complicated(?!?), annoying,... To prevent you from trying to circumvent our interface we've already done it for you. It's not clean, it's not safe, but it should at least be fast. Once again I'd like to warn you, but I'm sure you won't listen to me anyway.

If you read the definition of the KoFilter class, you can see that there are more pure virtual methods than just filter(...):

  • virtual const bool I_filter(const QCString &file, const QCString &from, QDomDocument &doc, const QCString &to, const QString &config=QString::null);
    This method can only be used to import documents (hence the 'I_'). The difference to the standard method is that you don't write to a file, but to a QDomDocument. This document is opened directly from the KOffice application. The advantage is, that there are no temporary files involved; the disadvantage is, that the application has to support it. At the moment only KSpread supports this kind of import.
  • virtual const bool I_filter(const QCString &file, KoDocument *document, const QCString &from, const QCString &to, const QString &config=QString::null);
    This one is even more hacky than the above one (and it's also for importing documents). Here you get direct access to an empty KoDocument(!). Please use this one only in very rare cases (e.g. where you have to transfer enormous amounts of data). Another huge disadvantage of that method (besides being insecure) is, that you'll have to change the filter, if the interface of the part changes.
  • virtual const bool E_filter(const QCString &file, const KoDocument * const document, const QCString &from, const QCString &to, const QString &config=QString::null);
    Well - that's still not enough for you? This method delivers a KoDocument which is used by the application (i.e. it's "full," not empty like the above one). Of course this method is used to export documents. Did I already say that you should use this method only if this is the only way out?

If you need some inspiration how to implement these kinds of methods, just have a look at koffice/filters/kspread/csv. You surely already know, that you can even use a configuration dialog with these filtering methods, too.
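
To illustrate the idea behind the first I_filter variant (filling an in-memory document instead of writing a temporary file), here is a small standard C++ sketch. TableSketch is a hypothetical stand-in for the QDomDocument that the application would open directly, importCsvInto is an invented name, and the CSV handling is deliberately naive (no quoting, and a trailing empty cell is dropped).

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical in-memory "document": rows of cells.
typedef std::vector<std::vector<std::string> > TableSketch;

// Reads delimiter-separated lines from `in` and appends them to `document`.
// Nothing is written to disk; the caller owns the document and can use it
// right away. Returns false if the document is still empty afterwards.
bool importCsvInto(std::istream &in, char delimiter, TableSketch &document) {
    std::string line;
    while (std::getline(in, line)) {
        std::vector<std::string> row;
        std::string cell;
        std::istringstream cells(line);
        while (std::getline(cells, cell, delimiter))
            row.push_back(cell);
        document.push_back(row);
    }
    return !document.empty();
}
```

The trade-off is the one described above: no temporary file is involved, but the filter and the application now have to agree on the in-memory representation.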





5.7. Add Documentation

Once you have finished your filter, please add some information about it.
At the very least, add a status file named status.html. In that file you should include

  • a feature list,
  • a history list,
  • a todo list,
  • maybe some nice links to more information (fileformat description, ...),
  • the author(s) of the filter with email addresses,
  • last page update.

Whenever you learn anything new about the filter, update the status.html file. The status file may contain the result of a finished investigation (e.g. whether code can be reused from another open source project; if yes, how, and if no, why not), or even the experience gained from an unfinished investigation, and of course the features the filter currently provides.

There is a status file template you can look at: temp

Once your documentation is done and the tables look right, mail it.

If you update your filter, don't forget to update your documentation too!





5.8. Links for Developers

During all the hours on the net searching for information on file formats and stuff like that, I came across some very interesting homepages:

http://www.wotsit.org Information on various file formats
http://msdn.microsoft.com The MS Developer's Network Library
http://arturo.directmail.org/filtersweb/ -
http://snake.cs.tu-berlin.de:8081/~schwartz/pmh/index.html LAOLA: The famous LAOLA homepage
http://skynet.csn.ul.ie/~caolan wv-library: The wv-library from Caolan McNamara (old link)
http://sourceforge.net/projects/wvware wv-library: The wv-library from Caolan McNamara (new link)
http://www.btinternet.com/~shaheedhaque/ Word 97: Information to Word Filters (Shaheed Haque)
http://xml.openoffice.org/ StarOffice: The future-format for StarOffice (open office) (XML)
http://www.corel.com/partners_developers/ds/wpsdks.htm WordPerfect, Presentations, Quattro Pro... Fileformat Description of Version 7.0
ftp://www.thekompany.com/pub/KOffice/filters/ Lotus API of WordPro
http://www-106.ibm.com/developerworks/library/buildappl/writedtd.html Doctype Description
http://www.openoffice.org/source/browse/sw/sw/source/filter/ StarOffice filters via WebCVS: sw6 (StarWord 6), ww8 (WinWord8), ascii, excel, html, lotus, rtf, xml, ...

Missing links WITH FILEFORMATS:
If you find another interesting one, please let me or us know so that I/we can add it here.
Maintained by the KOffice Web team. Last modified Jan. 02, 2001.