WEEKLY PUBLICATION DEADLINE: 12 pm GMT Sunday.

ISSUE NO 69

THE KNOTTY PROBLEM OF USING AFRICAN LANGUAGES FOR E-MAIL AND INTERNET

As the web and e-mail spread, Africans will increasingly want to have information and communicate in their own languages. For the purposes of generating and transmitting text electronically, Africa has three categories of languages: those that use basically the same characters one finds in the major languages of West European origin; those that use basically that same Latin alphabet but with some added letters; and those that use non-Latin alphabets. The last two pose considerable problems for those wanting to see digital advances as a way of improving communication. Bisharat's Don Osborn looks at how these obstacles can be tackled.

As the information revolution worldwide becomes increasingly multilingual, and as the new technologies in Africa gradually move beyond the capital cities, what are the barriers to greater use of the indigenous languages of the continent? There are of course a number of interrelated issues to consider in a comprehensive discussion of this question, which one might broadly characterize as including: structural issues (e.g., basic physical access to the technology, technical problems), socio-linguistic factors (issues relating to orthographies, literacy, multiplicity of languages and dialect variation within languages, and attitudes about languages), economic considerations (lack of resources, other priorities in using IT for development), and even political concerns (what effect validating linguistic diversity in the new technologies would have on divisions in a society). This short article highlights a fairly narrow but significant technical matter: the current possibilities for generation and transmission of text in African languages.

Text of course means characters, and the more characters a language needs from outside the set used in the main languages of IT, the more complicated the problem becomes. Since African languages do not all share the same orthographies, it is useful to group them in three categories in order to consider what is involved and what is actually being done:

1. those that use basically the same characters one finds in the major languages of West European origin;
2. those that use basically that same Latin alphabet but with some added letters;
3. those that use non-Latin alphabets.

1. For African languages of the first category, those that use the Latin alphabet of European languages, there are no special technical problems in working with text, producing web content, or even localizing software. This is especially the case for languages like Swahili, Somali, and many in Southern Africa that use only ASCII characters (i.e., no accents). Even languages such as Sango that use several accented characters common to major European languages can be readily used in word-processing and on the web (see for example http://sango.free.fr/).

2. However, many African languages, and most West African ones, use in their officially adopted orthographies the Latin alphabet with a few extra or different characters/letters or less-common digraphs to represent sounds not found in major European languages. The extended alphabet adopted by many countries for their maternal languages had its genesis at a conference of African language experts held in Bamako in 1966.

For using the special characters on computers and the internet there are several approaches:

(a) The "correct" one. That is, to have in a word processor a font that includes these characters. There actually seems to be a growing number of such fonts, often created to meet specific needs on a local level or as part of a commercial line of multilingual software.
Unfortunately they and the keyboard arrangements for them are generally incompatible with one another. For the web, the "correct" approach means being able to have these added characters in a text with a standard code for each character, a single code set including these, and some standard set of glyphs on the receiving end that a browser would call up to represent them. Unicode is proposed as a solution to this (as well as to the lack of standardization for word-processing). However, it is not there yet, as you might find, depending on how your browser handles Unicode (UTF-8), in looking at the Fula (Peulh, Pulaar), Ewe, Kabye, and Maninka versions of the Universal Declaration of Human Rights at http://www.unhchr.ch/udhr/navigate/region.htm. If you get a lot of empty boxes in the texts then you can see why people are still using workarounds such as those below to create and share text in these languages.

(b) The "old-correct" or obsolete one. That is, for some languages such as Bambara, Ewe, or Fula (Pular/Fuuta Jalon), some digraphs or accented characters used in European languages were employed before the special characters of the extended alphabet were officially adopted (e.g., "ny" or "n tilde" for the "n with left hook"; "o accent grave" or "underlined o" for the "open o"; "dh" for the "hooked d"). This is the approach I used when typing Bambara for class years ago or in e-mail more recently. It lets one produce and present text, but is not satisfactory to those who have learned in and/or are used to using the current orthography. Also, accents might be confused with tone indicators used in texts for some of the tonal languages (Bambara, Yoruba). A site with the "old-correct" transcription of Fula (Pular/Fuuta Jalon) is: http://www.fuuta-jalon.net/Pular/pular.html

(c) The substitute solution:
Use something that stands for the special characters. For instance, use capital letters in place of the special characters (e.g., "E" for the "open e"). An example in Bambara can be seen at: http://callisto.si.usherb.ca/~malinet/index_ba.html

(d) The "little
image file" solution, where little image files are used for the special characters and inserted as needed in the text. This is very cumbersome except for short texts. A site where that was done, for Bambara, is http://www.djembe.com/bambara_1.cfm. A Wolof learning site uses a little image file to help readers ascertain whether their browser can read the letter "eng".

(e) The "big image
file" solution, where text in proper orthography is turned into image files (.jpg or .pdf), usually for the web. One example for Fula (Pulaar) is at the bottom of the page at http://africandl.org/fuuta_lib/aan_pulaar-eng.html

(f) The "whatever
works easiest" (or "fast & dirty") solution. That is, just use the closest standard Latin letter for each special character (e.g., "e" for the "open e"). This was done with Bambara at:

(g) "Hybrid"
solutions are a mix of a couple of the above. For example, Wolof text at http://www.bok.net/pajol/index.wo.html uses accented characters but not the letter "eng." And two sites with Fula (Pular/Fuuta Jalon) deal in different ways with the transition from the old transcription to the new, the one cited in (b) above and

3. For African languages with their own script,
such as the Geez used in Ethiopian and Eritrean languages or Tifinagh used in Tamasheq and Berber, special coding is necessary. This process is already well advanced for Arabic, and is not unlike what has been or is being dealt with for the several non-Latin alphabets used across much of Eurasia. In the case of Geez, several font + keyboard packages are apparently available, raising the problem of mutually incompatible systems and the desirability
of some sort of standardization, as explored in

The technical issues relating to producing and sharing text in extended-Latin or non-Latin characters are not the only or even the most significant impediments to increased African language use on computers and the internet. And these problems can be got around in one way or another when people have a mind (and means) to do so. However, as increasing numbers of people in Africa encounter the new technologies, a multiplicity of incompatible and often ad hoc systems for processing their languages in computer software, together with the various alternative solutions for displaying text with special or non-Latin characters on the internet, will not be able to serve them well.

A shorter version of this article was included in the Open University Development and Environment Society Update (July 2001) <http://www.geocities.com/oudesociety>; it was derived from a note originally posted on the <africa_web_content_owner@YahooGroups.com> list (Feb. 2000).

Don Osborn is Associate Director for Agriculture with Peace Corps in Niger (the views expressed are his own). He is also the founder of the Bisharat Language, Technology & Development Initiative (www.kabissa.org/bisharat).
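
As an illustration of the workarounds described in section 2, the following Python sketch prints the Unicode code point and UTF-8 bytes for the special letters named in the article, then applies approaches (b), (c) and (f) to a test string. The code points and character names come from the Unicode standard. The substitution tables are illustrative assumptions that go beyond the few pairs the article gives ("ny", "dh", "o accent grave", "E" and "e"), and the test string is arbitrary rather than real text in any language.

```python
import unicodedata

# Special letters named in the article (code points from the Unicode standard).
SPECIAL_LETTERS = {
    "open e": "\u025b",            # ɛ
    "open o": "\u0254",            # ɔ
    "hooked d": "\u0257",          # ɗ
    "n with left hook": "\u0272",  # ɲ
    "eng": "\u014b",               # ŋ
}

# Approach (b), "old-correct": digraphs or accented letters given in the article.
OLD_CORRECT = str.maketrans({"\u0272": "ny", "\u0257": "dh", "\u0254": "\u00f2"})

# Approach (c), substitute: capitals. Only "E" is given in the article;
# "O" for the open o is an assumption made for illustration.
CAPITALS = str.maketrans({"\u025b": "E", "\u0254": "O"})

# Approach (f), "whatever works easiest": closest plain Latin letter.
# Only "e" is given in the article; the other pairs are assumptions.
CLOSEST = str.maketrans(
    {"\u025b": "e", "\u0254": "o", "\u0257": "d", "\u0272": "n", "\u014b": "n"}
)


def describe(ch: str) -> str:
    """Return the code point, Unicode name and UTF-8 bytes (hex) of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)} utf-8={ch.encode('utf-8').hex()}"


def is_ascii(text: str) -> bool:
    """True if every character is in the 7-bit ASCII range."""
    return all(ord(c) < 128 for c in text)


if __name__ == "__main__":
    for label, ch in SPECIAL_LETTERS.items():
        print(f"{label}: {describe(ch)}")

    sample = "\u0272\u025b \u0257\u0254\u014b"  # arbitrary test string: ɲɛ ɗɔŋ
    for name, table in [("(b)", OLD_CORRECT), ("(c)", CAPITALS), ("(f)", CLOSEST)]:
        converted = sample.translate(table)
        print(name, converted, "ASCII only:", is_ascii(converted))
```

With these partial tables only the (f) table yields pure ASCII output. It also maps both the "n with left hook" and "eng" to a plain "n", so the original spelling cannot be recovered from the converted text, which is one reason such shortcuts do not satisfy readers used to the current orthography.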
This page last updated on January 28 2004.