Defining TermBases

Here at Supertext we’re contemplating – just contemplating, mind – the thought of building our own termbase. To this end I recently looked into what features are currently available if we were to choose an off-the-shelf termbase, and what we would want to implement in our own.

The Purpose

I’ll spare you the details of the work process behind each job at Supertext, but suffice it to say that our aim is to improve the workflow while hopefully making our freelancers’ jobs easier. The overwhelming majority of our work is carried out using Trados. We believe that we can automate part of our workflow by semi-automating the Trados packages. Furthermore, we have plans which will not only make it easier for freelancers to work with the termbase but also encourage them to do so.

Another aspect of our purpose is to foster not only a house style but also an agreed style for each customer, and to preserve this style across jobs.

The Fundamentals

We found a helpful online course which explains the fundamentals of a termbase. For example, the decision must be taken early on whether the termbase will be term-led or concept-led. In a term-led termbase, the database stores terms, with one or more concepts ‘attached’ to each term. In a concept-led termbase, the database stores a collection of concepts, with one or more terms attached to each concept. It is likely that any Supertext termbase would follow the concept-led (onomasiological) model.

At this stage I should probably explain some basic points: the concept in a termbase can be regarded as the overarching idea, labelled by a word or combination of words, under which the terms that express it are collected. Or to put it another way, a concept gathers the terms which serve to describe it: a herding of contextually similar words. Furthermore, in a multilingual termbase (as we’re thinking of building) it is the terms which contain the translations.

Another recurring notion across termbases is the dictionary, though some termbase applications simply refer to these as termbases. The dictionary is simply a collection of concepts and is thus the uppermost entity in the model. Concepts are collected into dictionaries so that a context may be set. For example, the term developer in the world of software engineering has a similar but distinctly different connotation to developer in the world of real estate or property management, and in the domain of startup companies or business management developer refers to a different role again.

Something that we’ll probably aim for, should we construct a Supertext termbase, is public dictionaries which are accessible by all, and customer/company-specific dictionaries which are writable only by personnel with specific access.

So, by this point our potential termbase is going to contain a collection of public and private dictionaries, inside of which are collections of concepts, each of which has terms which themselves are also translations.
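To make that hierarchy a little more concrete, here is a minimal sketch of how it could be modelled. This is not a design we have settled on: the class names, the `public` flag and the sample customer are all assumptions made purely for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Term:
    text: str
    language: str  # language code such as "en" or "de"


@dataclass
class Concept:
    label: str
    terms: list[Term] = field(default_factory=list)

    def translations(self, language: str) -> list[str]:
        """Return every term recorded for this concept in one language."""
        return [t.text for t in self.terms if t.language == language]


@dataclass
class Dictionary:
    name: str
    public: bool = True  # hypothetical flag: False means customer-specific
    concepts: list[Concept] = field(default_factory=list)


# The same label can sit in two dictionaries with different translations.
software = Dictionary("Software engineering", concepts=[
    Concept("developer", [Term("developer", "en"), Term("Entwickler", "de")]),
])
real_estate = Dictionary("ACME real estate", public=False, concepts=[
    Concept("developer", [Term("developer", "en"), Term("Bauträger", "de")]),
])
```

Because the concept sits above the term, asking each dictionary for the German translation of developer gives a different answer, which is exactly the context-setting job that dictionaries are meant to do.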

First Impressions

The first thing that struck me about the current crop of termbases is how dated they look. There is one, termbases.eu, which has a modern appearance (which it clearly owes to Twitter’s Bootstrap), but the rest look like the height of 1990s web design. Yes, they look professionally-built and with a sense of security and reliability, but I was left wondering when any development work last took place. Maybe we’re just spoilt by web trends changing so frequently – after all, how many offices are renovated every 4 or 5 years?

TermWeb’s appearance suggests that HTML 4.01 is still in vogue and that proprietary user controls are still acceptable.

termbases.eu has the most contemporary appearance, due entirely to Twitter’s Bootstrap CSS framework, though this doesn’t make the UI any easier to use.

Some termbases, such as TermWeb, refer to the notion of a dictionary as a domain. Another termbase, WebTerm, either doesn’t use such a concept as a dictionary/domain or else masks this from the user. Another, termbases.eu, actually refers to these dictionary-like collections as termbases.

WebTerm appears to offer nothing in the way of options, other than specifying the language.

Another feature that is commonly listed on existing termbases is that they support Unicode. If there is to be any hope of a termbase being multilingual this is surely a prerequisite.

As well as being multilingual, another mandatory feature is the ability to import from existing termbases. Like almost all of the others, we’ll most likely allow import/export using the standardised TBX format. Others also allow importing from Excel and CSV files. Both of these are readily possible for us; the only problem is that, once a file is uploaded, you have to present further options so that the user can specify what data is where. There’s no set format for the data structure of a termbase, so without a hierarchical document format like TBX you need to ask the user to describe the hierarchy, for example which column holds the concept and which columns hold the terms in each language.
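As a rough illustration of that extra step for flat files, the sketch below takes a CSV upload plus a mapping that the user would have to supply. The function name, the column headers and the shape of the mapping are assumptions for the sake of the example, not a format that any existing termbase prescribes.

```python
import csv
import io


def import_csv(csv_text: str, column_languages: dict[str, str], concept_column: str):
    """Group the rows of a flat CSV into concepts using a user-supplied mapping.

    column_languages maps a column header to a language code; concept_column
    names the column whose value labels the concept.
    """
    concepts = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        terms = concepts.setdefault(row[concept_column], [])
        for column, language in column_languages.items():
            text = (row.get(column) or "").strip()
            # Skip empty cells and terms already recorded for this concept.
            if text and (language, text) not in terms:
                terms.append((language, text))
    return concepts


# Hypothetical upload: two rows that belong to the same concept.
upload = "Concept,English,Deutsch\ndeveloper,developer,Entwickler\ndeveloper,programmer,\n"
mapping = {"English": "en", "Deutsch": "de"}
concepts = import_csv(upload, mapping, concept_column="Concept")
```

The mapping is the ‘hierarchical statement’: without it the importer has no way of knowing that the first column labels the concept and the other two hold terms in different languages.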

Other Features

The notion of exporting brings about another possibility. Files are commonly sent to freelance translators along with their jobs to help them translate the work in a previously accepted style. In line with this, it might be useful to present a URL which directly downloads a TBX file containing the pertinent terms. More generally, one of our goals is to be able to provide URLs to specific dictionaries, concepts and terms. I mentioned at the start of this post that we’re hoping to streamline the workflow of both our language managers and freelancers, and something that we’ll take a look at is building a custom provider for Trados which will allow our freelancers to hook up to our termbase with minimal effort, thus removing the need to pass extra files back and forth.

The ability to search is also pretty much mandatory, but the strength of the tool will depend on how easily users can search and how many options they are presented with. I’m sure fuzzy search and wildcard characters, for example, would be welcome and well-used.
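For a feel of what those two options involve, here is a small sketch using only Python’s standard library: `fnmatch` for wildcard queries and `difflib` for fuzzy ones. The `search` function and its `cutoff` default are assumptions chosen for the example; a real termbase would more likely lean on the search features of its database.

```python
import difflib
import fnmatch


def search(terms: list[str], query: str, cutoff: float = 0.75) -> list[str]:
    """Match with wildcards if the query contains * or ?, otherwise fuzzily."""
    # Compare in lower case, but return the terms as originally written.
    lowered = {term.lower(): term for term in terms}
    if "*" in query or "?" in query:
        hits = fnmatch.filter(lowered, query.lower())
    else:
        hits = difflib.get_close_matches(
            query.lower(), list(lowered), n=5, cutoff=cutoff
        )
    return [lowered[hit] for hit in hits]


terms = ["developer", "development", "Entwickler", "termbase"]
wildcard_hits = search(terms, "develop*")  # prefix search with a wildcard
fuzzy_hits = search(terms, "develeper")    # misspelt query, matched fuzzily
```

Raising or lowering `cutoff` trades precision for tolerance of typos, which is the kind of option that could be exposed to users.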

As far as results go, the intention would be to search for a matching term and, when found, display all other terms collected under the same concept. However, to take it one step further, a termbase will typically allow one record (i.e. term or concept) to reference another, and some go into fairly technical depth to let the user declare the type of reference. For example, the relationship may be expressed as a homonym, paronym, antonym, hyponym, hypernym (superordinate), meronym, or holonym. There’s definitely some value to this, but I suspect – and please comment if your experience or opinion is to the contrary – that translators would rarely search a termbase to find the hyponym of a term. On the other hand, I do see good reason for encouraging links based on synonyms. I also see merit in allowing an editor to create a reference to any other term in the entire termbase, along with an explanation of why this might be relevant.
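The lookup and the typed references described above could be sketched roughly as follows. The relation types are a subset of those listed, `SEE_ALSO` stands in for the free-form editor link, and the sample data and names are invented for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class RelationType(Enum):
    SYNONYM = "synonym"
    ANTONYM = "antonym"
    HYPONYM = "hyponym"
    HYPERNYM = "hypernym"
    SEE_ALSO = "see also"  # free-form editor link, explained by the note


@dataclass
class Relation:
    source: str  # label of the concept the link starts from
    target: str  # label of the concept it points to
    kind: RelationType
    note: str = ""


# Concept label -> terms collected under it (hypothetical sample data).
concepts = {
    "developer": ["developer", "programmer", "Entwickler"],
    "tester": ["tester", "QA engineer", "Tester"],
}
relations = [
    Relation("developer", "tester", RelationType.SEE_ALSO, "Often confused in job ads"),
]


def lookup(query: str):
    """Return the concept, sibling terms and outgoing links for a matching term."""
    for label, terms in concepts.items():
        if query in terms:
            siblings = [t for t in terms if t != query]
            links = [r for r in relations if r.source == label]
            return label, siblings, links
    return None
```

Searching for programmer would return the developer concept, its other terms and the editor’s link to tester together with the note explaining why the link exists.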

The whole idea of having a Supertext termbase is to better enable our freelance translators to do their job accurately and quickly. However, a termbase is only as good as its contents, and to this end we envisage that our termbase will be set up much like a wiki. That is to say that the freelance translators may update it as well as query it, in a number of ways. That said, anything open in this manner is susceptible to inaccuracy, or simple misuse, and therefore we also plan to implement some sort of changelog and rights management. Both WebTerm and TermWeb offer a fairly full-featured changelog which allows users to see when a record was created and by whom, as well as all edits, comments, etc., but a great feature of this is being able to click on a historical change and have that iteration become current. This in turn raises questions about the approval process and so forth, all of which have yet to be fully fathomed here in the Supertext office.
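One simple way to get that kind of changelog is to treat every edit as a new revision and to implement ‘make this iteration current’ as one more revision that copies an old one. The sketch below assumes exactly that; I don’t know how WebTerm or TermWeb do it internally, and the class and method names are made up.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Revision:
    text: str
    author: str
    comment: str
    timestamp: datetime


@dataclass
class TermRecord:
    history: list[Revision] = field(default_factory=list)

    @property
    def current(self) -> str:
        """The text of the most recent revision."""
        return self.history[-1].text

    def edit(self, text: str, author: str, comment: str = "") -> None:
        """Record a change without discarding anything that came before."""
        self.history.append(
            Revision(text, author, comment, datetime.now(timezone.utc))
        )

    def revert_to(self, index: int, author: str) -> None:
        """Make an earlier revision current by appending a copy of it."""
        old = self.history[index]
        self.edit(old.text, author, f"Reverted to revision {index}")


record = TermRecord()
record.edit("Entwickler", "anna", "created")
record.edit("Entwikler", "ben", "typo introduced")
record.revert_to(0, "anna")
```

Because the revert is itself logged, the mistaken edit stays visible in the history, which might also give an approval process something to hang on to.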

Who knows, a Supertext termbase may never see the light of day, but it would certainly be an interesting project to work on.

 



