Freitag, 15. Mai 2009

XML - Again...

Ok it seems my last blog post has triggered quite a reply.

However, I think there still seems to be a bit of confusion. Let's try again...

Why am I so into XML-based standards? Because I understand them.

Even if this was not your intention, this is a bit misleading as it suggests that I don't understand XML files and if I would, I would share your points of view.

I can assure you that I understand the principles of XML files very well. Hell, a lot of parts of simon even use XML files (take a look at how the commands are stored for example).

But at the moment it just doesn't make sense to use a custom PLS modification (adding terminal tags). We need the lexicon to be readable by julius and HTK. This implies that we have to store the lexicon in that format (or add support to Julius; HTK is essential closed-source so we would still have to keep a seperate HTK lexicon around). It essentially does not matter what I believe is the better format for the job.

So the only possible way to incorporate a XML based dictionary storage format would be to add an additional layer. This, however, means that the features supported can only be the smallest common denominator of both formats. So no fancy IPA (no support in HTK), no nice multiple-graphemes per word (HTK could be compared to 1NF if you are familiar with database normalization), etc. In the end this additional layer would bring nothing beneficial to the table because we can't use it's nice features as long as we have to keep HTK compliant too. All it would do is introduce another source for errors.

All these considerations are irrelevant when we take Julius and HTK out of the equation. Then, adopting and modifying PLS is not such a bad idea (altough I would like to store commands and the dictionary in the same file for the upcoming package-based structure). Removing the dependency on HTK is something I would like very much but it doesn't seem feasible right now and in the near future.

And, by the way: I don’t like to read SAMPA. I prefer the IPA when editing the pronouncing dictionary.

The HTK does not support UTF-8. However, I would prefer using the SAMPA even if it did. I find that it is much easier to read and learn the SAMPA (especially if you speak german). Also, I do prefer to be able to transcribe my words with the keyboard instead of using sign-tables to pick out the symbols.

As the IPA and X-Sampa can be converted to and from each other without loosing anything I don't really see a problem there.

Sometimes, I ask myself the question: why don’t they switch from SAMPA to the IPA? Why don’t they switch their homepage from ISO-8895-1 to UTF-8?

This somehow confused me a bit. Our homepage is UTF-8 encoded? So are all the files produced by simon (except where it is not possible because of third party products that don't support it)...

Then I saw that you linked to the SPHINX homepage and not to our homepage...

export functionality is a low priority feature

OK. From my point of view, Voxforge needs an export functionality.

This is especially confusing as I was talking about export functionality of simon. You talk about an export functionality from Voxforge which would be an import feature from simons point of view. And as I stated in my previous blog post this is something that I am indeed very interested in.

I don’t know about the exotic BOMP standard, I couldn’t find an entry in the Wikipedia. So I assume that BOMP is not a relevant standard.

BOMP is no standard at all. It is a dictionary following the HADIFIX "standard". HADIFIX is a speech synthesis project that uses phonetic dictionaries to know how to pronounce the words. Those dictionaries have to follow a specific format which could be called the "HADIFIX standard" (I have not found a definition of it anywhere).
The import functionality was implemented because a very large, high quality phonetic dictionary (the "BOMP" dictionary) exists following that format.

Simon allows me to record just single words, not utterances. I am not convinced by that concept.

I wouldn't be either. Fortunately, this is not true. Take a look at the Training module. You can easily import "normal" Texts. Try to input a text file containing this: "I am an utterance. And here comes another.". Even the standard examples shipped with simon contain sentences and not individual words, btw.

You see, there are several aspects. The world is not just about simon. It is about Voxforge, too.

Please don't lecture me. It is disrespectful and unnecessary. I am very grateful of the effort that Ken and all the contributers put into Voxforge and actively promote participation when people ask me about dictation with simon.

I am also investigating how to best use the voxforge model with simon and have stated on several occasions that I have intention to integrate the possibility to contribute to voxforge from within simon.

Followup 16.05.2009
Today I have been contacted by ralfherzog by e-Mail where he explained the misunderstanding.

Keine Kommentare: