========================================================================
*  icu/packaging/README
========================================================================
Copyright (C) 2000-2003, International Business Machines Corporation
and others. All Rights Reserved.

This directory contains information, input files and scripts for
packaging ICU using specific packaging tools. We assume that the
packager is familiar with the tools and procedures needed to build a
package for a given packaging method (for example, how to use
dpkg-buildpackage(1) on Debian GNU/Linux, or rpm(8) on distributions
that use RPM packages).

Please read the file PACKAGES if you are interested in packaging ICU
yourself. It describes what the different packages should be, and what
their contents are.

========================================================================
*  icu/source/extra/uconv/README
========================================================================
Copyright (c) 2002, International Business Machines Corporation and
others. All Rights Reserved.

The uconv command is an iconv(1)-like conversion / transcoding
program. Please check its manual page, or run uconv -h, for help.

Help and error messages are displayed through a resource bundle.
Please contact Steven Loomis if you want to offer a translation of
these messages for a particular locale.

uconv was originally written and contributed to icuapps by Jonas
Utterström, and offered simple conversion and a way to know which
encodings were available. It has since been moved to the main ICU
distribution and converted to the C conversion API, and is maintained
by Yves Arrouye, who seems to always be looking for one more feature or
option to add to the tool.
========================================================================
*  icu/source/samples/break/readme.txt
========================================================================
Copyright (c) 2002-2010, International Business Machines Corporation and
others. All Rights Reserved.

break: Boundary Analysis

This sample demonstrates
         Using ICU to determine the linguistic boundaries within text

Files:
    break.cpp      Main source file in C++
    ubreak.c       Main source file in C
    break.sln      Windows MSVC workspace.  Double-click this to get started.
    break.vcproj   Windows MSVC project file

To Build break on Windows
    1.  Install and build ICU
    2.  In MSVC, open the workspace file icu\samples\break\break.sln
    3.  Choose a Debug or Release build.
    4.  Build.

To Run on Windows
    1.  Start a command shell window
    2.  Add ICU's bin directory to the path, e.g.
            set PATH=c:\icu\bin;%PATH%
        (Use the path to wherever ICU is on your system.)
    3.  cd into the break directory, e.g.
            cd c:\icu\source\samples\break\debug
    4.  Run it (Warning: Be careful, 'break' is also a system command
        on many systems)
            .\break

To Build on Unixes
    1.  Build ICU.
        Specify an ICU install directory when running configure,
        using the --prefix option.  The steps to build ICU will look
        something like this:
            cd /source
            runConfigureICU --prefix [other options]
            gmake all

    2.  Install ICU,
            gmake install

    3.  Compile
            cd /source/samples/break
            gmake ICU_PREFIX=

To Run on Unixes
    cd /source/samples/break
    gmake ICU_PREFIX= check
        -or-
    export LD_LIBRARY_PATH=/lib:.:$LD_LIBRARY_PATH
    break

Note:  The name of the LD_LIBRARY_PATH variable is different on some
       systems.  If in doubt, run the sample using "gmake check", and
       note the name of the variable that is used there.
       LD_LIBRARY_PATH is the correct name for Linux and Solaris.

========================================================================
*  icu/source/samples/cal/readme.txt
========================================================================
Copyright (c) 2002-2005, International Business Machines Corporation and others.
All Rights Reserved.

icucal: a sample program which displays the calendar.

This sample demonstrates
         Formatting a calendar
         Outputting text in the default codepage to the console

Files:
    cal.c        Main source file
    uprint.h     codepage output convenience header
    uprint.c     codepage output convenience implementation
    cal.sln      Windows MSVC workspace.  Double-click this to get started.
    cal.vcproj   Windows MSVC project file

To Build icucal on Windows
    1.  Install and build ICU
    2.  In MSVC, open the workspace file icu\samples\cal\cal.sln
    3.  Choose a Debug or Release build.
    4.  Build.

To Run on Windows
    1.  Start a command shell window
    2.  Add ICU's bin directory to the path, e.g.
            set PATH=c:\icu\bin;%PATH%
        (Use the path to wherever ICU is on your system.)
    3.  cd into the cal directory, e.g.
            cd c:\icu\source\samples\cal\debug
    4.  Run it
            cal

To Build on Unixes
    1.  Build ICU.  icucal is built automatically by default unless
        samples are turned off.
        Specify an ICU install directory when running configure,
        using the --prefix option.  The steps to build ICU will look
        something like this:
            cd /source
            runConfigureICU --prefix [other options]
            gmake all

    2.  Install ICU,
            gmake install

To Run on Unixes
    cd /source/samples/cal
    gmake check
        -or-
    export LD_LIBRARY_PATH=/lib:.:$LD_LIBRARY_PATH
    cal

Note:  The name of the LD_LIBRARY_PATH variable is different on some
       systems.  If in doubt, run the sample using "gmake check", and
       note the name of the variable that is used there.
       LD_LIBRARY_PATH is the correct name for Linux and Solaris.

========================================================================
*  icu/source/samples/case/readme.txt
========================================================================
Copyright (c) 2003-2005, International Business Machines Corporation and
others. All Rights Reserved.

case: case mapping

This sample demonstrates
         Using ICU to convert between different cases

Files:
    case.cpp     Main source file in C++
    ucase.c      Main source file in C
    case.sln     Windows MSVC workspace.
                 Double-click this to get started.
    case.vcproj  Windows MSVC project file

To Build case on Windows
    1.  Install and build ICU
    2.  In MSVC, open the solution file icu\samples\case\case.sln
        (or use the workspace All, in icu\samples\all\all.sln)
    3.  Choose a Debug or Release build.
    4.  Build.

To Run on Windows
    1.  Start a command shell window
    2.  Add ICU's bin directory to the path, e.g.
            set PATH=c:\icu\bin;%PATH%
        (Use the path to wherever ICU is on your system.)
    3.  cd into the case directory, e.g.
            cd c:\icu\source\samples\case\debug
    4.  Run it
            case

To Build on Unixes
    1.  Build ICU.
        Specify an ICU install directory when running configure,
        using the --prefix option.  The steps to build ICU will look
        something like this:
            cd /source
            runConfigureICU --prefix [other options]
            gmake all

    2.  Install ICU,
            gmake install

    3.  Compile
            cd /source/samples/case
            gmake ICU_PREFIX=

To Run on Unixes
    cd /source/samples/case
    gmake ICU_PREFIX= check
        -or-
    export LD_LIBRARY_PATH=/lib:.:$LD_LIBRARY_PATH
    case

Note:  The name of the LD_LIBRARY_PATH variable is different on some
       systems.  If in doubt, run the sample using "gmake check", and
       note the name of the variable that is used there.
       LD_LIBRARY_PATH is the correct name for Linux and Solaris.

========================================================================
*  icu/source/samples/citer/readme.txt
========================================================================
Copyright (c) 2003-2010, International Business Machines Corporation and
others. All Rights Reserved.

citer: Character Iteration

This sample demonstrates
         ICU's CharacterIterator

Files:
    citer.cpp      Main source file in C++
    citer.sln      Windows MSVC workspace.  Double-click this to get started.
    citer.vcproj   Windows MSVC project file

To Build citer on Windows
    1.  Install and build ICU
    2.  In MSVC, open the workspace file icu\samples\citer\citer.sln
    3.  Choose a Debug or Release build.
    4.  Build.

To Run on Windows
    1.  Start a command shell window
    2.  Add ICU's bin directory to the path, e.g.
            set PATH=c:\icu\bin;%PATH%
        (Use the path to wherever ICU is on your system.)
    3.  cd into the citer directory, e.g.
            cd c:\icu\source\samples\citer\debug
        (note that it may be in a different relative directory than
        most of the other samples).
    4.  Run it
            citer

To Build on Unixes
    1.  Build ICU.
        Specify an ICU install directory when running configure,
        using the --prefix option.  The steps to build ICU will look
        something like this:
            cd /source
            runConfigureICU --prefix [other options]
            gmake all

    2.  Install ICU,
            gmake install

    3.  Compile
            cd /source/samples/citer
            gmake ICU_PREFIX=

To Run on Unixes
    cd /source/samples/citer
    gmake ICU_PREFIX= check
        -or-
    export LD_LIBRARY_PATH=/lib:.:$LD_LIBRARY_PATH
    citer

Note:  The name of the LD_LIBRARY_PATH variable is different on some
       systems.  If in doubt, run the sample using "gmake check", and
       note the name of the variable that is used there.
       LD_LIBRARY_PATH is the correct name for Linux and Solaris.

========================================================================
*  icu/source/samples/coll/readme.txt
========================================================================
Copyright (c) 2002-2005, International Business Machines Corporation and
others. All Rights Reserved.

coll: a sample program which compares 2 strings with a user-defined
collator.

This sample demonstrates
         Creating a user-defined collator
         Comparing 2 strings using the collator created

Files:
    coll.c        Main source file
    coll.sln      Windows MSVC workspace.  Double-click this to get started.
    coll.vcproj   Windows MSVC project file

To Build coll on Windows
    1.  Install and build ICU
    2.  In MSVC, open the workspace file icu\samples\coll\coll.sln
    3.  Choose a Debug or Release build.
    4.  Build.

To Run on Windows
    1.  Start a command shell window
    2.  Add ICU's bin directory to the path, e.g.
            set PATH=c:\icu\bin;%PATH%
        (Use the path to wherever ICU is on your system.)
    3.  cd into the coll directory, e.g.
            cd c:\icu\source\samples\coll\debug
    4.
        Run it
            coll [options*] -source source_string -target target_string

To Build on Unixes
    1.  Build ICU.  coll is built automatically by default unless
        samples are turned off.
        Specify an ICU install directory when running configure,
        using the --prefix option.  The steps to build ICU will look
        something like this:
            cd /source
            runConfigureICU --prefix [other options]
            gmake all

    2.  Install ICU,
            gmake install

To Run on Unixes
    cd /source/samples/coll
    gmake check
        -or-
    export LD_LIBRARY_PATH=/lib:.:$LD_LIBRARY_PATH
    coll

Note:  The name of the LD_LIBRARY_PATH variable is different on some
       systems.  If in doubt, run the sample using "gmake check", and
       note the name of the variable that is used there.
       LD_LIBRARY_PATH is the correct name for Linux and Solaris.

========================================================================
*  icu/source/samples/csdet/readme.txt
========================================================================
Copyright (c) 2001-2010 International Business Machines Corporation and
others. All Rights Reserved.

csdet: Charset Detection

This sample demonstrates
         Using ICU's CharSet Detection API

Files:
    csdet.c   Main source file
    *.txt     Various sample .txt files

To Build csdet on Windows
    1.  Install and build ICU
    2.  In MSVC, open the workspace All, in icu\samples\all\all.sln
    3.  Choose a Debug or Release build.
    4.  Build.

To Run on Windows
    1.  Start a command shell window
    2.  Add ICU's bin directory to the path, e.g.
            set PATH=c:\icu\bin;%PATH%
        (Use the path to wherever ICU is on your system.)
    3.  cd into the csdet directory, e.g.
            cd c:\icu\source\samples\csdet\debug
    4.  Run it (with the name of a text file, e.g. eucJP.txt)
            csdet eucJP.txt

    WARNING: The .txt files must be in the same directory as the
    executable, which is not the case by default on some systems.

To Build on Unixes
    1.  Build ICU.
        Specify an ICU install directory when running configure,
        using the --prefix option.
        The steps to build ICU will look something like this:
            cd /source
            runConfigureICU --prefix [other options]
            gmake all

    2.  Install ICU,
            gmake install

    3.  Compile
            cd /source/samples/csdet
            gmake ICU_PREFIX=

To Run on Unixes
    cd /source/samples/csdet
    gmake ICU_PREFIX= check
        -or-
    export LD_LIBRARY_PATH=/lib:.:$LD_LIBRARY_PATH
    csdet eucJP.txt

Note:  The name of the LD_LIBRARY_PATH variable is different on some
       systems.  If in doubt, run the sample using "gmake check", and
       note the name of the variable that is used there.
       LD_LIBRARY_PATH is the correct name for Linux and Solaris.

========================================================================
*  icu/source/samples/date/readme.txt
========================================================================
Copyright (c) 2002-2010, International Business Machines Corporation and
others. All Rights Reserved.

icudate: a sample program which displays the current date

This sample demonstrates
         Formatting a date
         Outputting text in the default codepage to the console

Files:
    date.c        Main source file
    uprint.h      codepage output convenience header
    uprint.c      codepage output convenience implementation
    date.sln      Windows MSVC workspace.  Double-click this to get started.
    date.vcproj   Windows MSVC project file

To Build icudate on Windows
    1.  Install and build ICU
    2.  In MSVC, open the workspace file icu\samples\date\date.sln
    3.  Choose a Debug or Release build.
    4.  Build.

To Run on Windows
    1.  Start a command shell window
    2.  Add ICU's bin directory to the path, e.g.
            set PATH=c:\icu\bin;%PATH%
        (Use the path to wherever ICU is on your system.)
    3.  cd into the icudate directory, e.g.
            cd c:\icu\source\samples\date\debug
    4.  Run it (Warning: Be careful, 'date' is also a system command
        on many systems)
            .\date

To Build on Unixes
    1.  Build ICU.  icudate is built automatically by default unless
        samples are turned off.
        Specify an ICU install directory when running configure,
        using the --prefix option.
        The steps to build ICU will look something like this:
            cd /source
            runConfigureICU --prefix [other options]
            gmake all

    2.  Install ICU,
            gmake install

To Run on Unixes
    cd /source/samples/date
    gmake check
        -or-
    export LD_LIBRARY_PATH=/lib:.:$LD_LIBRARY_PATH
    date

Note:  The name of the LD_LIBRARY_PATH variable is different on some
       systems.  If in doubt, run the sample using "gmake check", and
       note the name of the variable that is used there.
       LD_LIBRARY_PATH is the correct name for Linux and Solaris.

========================================================================
*  icu/source/samples/datefmt/README.TXT
========================================================================
Copyright (c) 2002-2010, International Business Machines Corporation and
others. All Rights Reserved.

IMPORTANT: This sample was originally intended as an exercise for the
ICU Workshop (September 2000). The code currently provided in the
solution file is the answer to the exercises; each step can still be
found in the 'answers' subdirectory.

** Workshop homepage is:
   http://www.icu-project.org/docs/workshop_2000/agenda.html

   #Date/Time/Number Formatting Support
   9:30am - 10:30am
   Alan Liu

   Topics:
   1. What is the date/time support in ICU?
   2. What is the timezone support in ICU?
   3. What kind of formatting and parsing support is available in ICU,
      i.e. NumberFormat, DateFormat, MessageFormat?

INSTRUCTIONS
------------

This exercise was first developed and tested on ICU release 1.6.0,
Win32, Microsoft Visual C++ 6.0.  It should work on other ICU releases
and other platforms as well.

MSVC:  Open the file "datefmt.sln" in Microsoft Visual C++.

Unix:  - Build and install ICU with a prefix, for example
         '--prefix=/home/srl/ICU'
       - Set the variable ICU_PREFIX=/home/srl/ICU and use GNU make in
         this directory.
       - You may use 'make check' to invoke this sample.

PROBLEMS
--------

Problem 0:

  Set up the program, build it, and run it.  To start with, the
  program prints out a list of languages.

Problem 1: Basic Date Formatting (Easy)

  Create a calendar, and use it to get the UDate for June 4, 1999,
  0:00 GMT (or any date of your choosing).  You will have to create a
  TimeZone (use the createZone() function already defined in main.cpp)
  and a Calendar object, and make the calendar use the time zone.

  Once you have the UDate, create a DateFormat object in each of the
  languages in the LANGUAGE array, and display the date in that
  language.  Use the DateFormat::createDateInstance() method to create
  the date formatter.

Problem 2: Date Formatting, Specific Time Zone (Medium)

  To really localize a time display, one can also specify the time
  zone in which the time should be displayed.  For each language,
  also create different time zones from the TIMEZONE list.

  To format a date with a specific calendar and zone, you must deal
  with three objects: a DateFormat, a Calendar, and a TimeZone.  Each
  object must be linked to another in correct sequence:  The Calendar
  must use the TimeZone, and the DateFormat must use the Calendar.

    DateFormat  =uses=>  Calendar  =uses=>  TimeZone

  Use either setFoo() or adoptFoo() methods, depending on where you
  want to have ownership.

  NOTE: It's not always desirable to change the time to a local time
  zone before display.  For instance, if some event occurs at 0:00 GMT
  on the first of the month, it's probably clearer to just state that.
  Stating that it occurs at 5:00 PM PDT on the day before in the
  summer, and 4:00 PM PST on the day before in the winter will just
  confuse the issue.

NOTES
-----

To see a list of system TimeZone IDs, use the
TimeZone::createAvailableIDs() methods.  Alternatively, look at the
file icu/docs/tz.htm.  This has a hyperlinked list of current system
zones.

ANSWERS
-------

The exercise includes answers.  These are in the "answers" directory,
and are numbered 1, 2, etc.

If you get stuck and you want to move to the next step, copy the
answers file into the main directory in order to proceed.
E.g., "main_1.cpp" contains the original "main.cpp" file.  "main_2.cpp"
contains the "main.cpp" file after problem 1.  Etc.

Have fun!

========================================================================
*  icu/source/samples/legacy/README
========================================================================
Copyright (c) 2002, International Business Machines Corporation and
others. All Rights Reserved.

This example demonstrates running an instance of ICU 1.8.1 together
with a current version of ICU. It only tests u_getVersion and several
collation APIs. Generally, one should be able to simultaneously use one
or more versions of ICU 2.0 or higher and one version of ICU 1.8.1 or
lower.

What is it all about:
Let's say you have a 10 TB database indexed using ICU 1.8.1 sortkeys.
A new ICU comes out, with neat new features you would like to use, but
also with new sortkeys, and you don't care to reindex your 10 TB
database. What to do then? You can use ICU 1.8.1 in one of your
compilation units and the current version in all the others. So, you
can use the old collation until you decide to reindex.

You cannot mix two versions of ICU in the same compilation unit.
You cannot automatically use more than one legacy version of ICU.

In order to make the compilation unit use the old version of ICU, you
have to do a couple of things:
1) change its include path so that it includes header files from the
   old version
2) explicitly add the old libraries to the linker
3) make sure old data can be found (if legacy code needs data).

Building and running of the example:

Linux:
To make it work, you should build and install both the current ICU and
ICU 1.8.1.
1. Put both data libraries wherever ICU_DATA points (usually it is
   $(prefix)/share/icu/$(icu_version)/). If data libraries are used,
   then check for $(prefix)/lib/icu/1.8.1, which should contain
   libicudata.so and libicudt18*.so
2. Copy libicuuc.so.18* and libicui18n.so.18* to the $(prefix)/lib
   directory (together with the current libraries).
3.
   Should work on other Unixes. Change $ICU_PREFIX to point to the
   current installation, and $ICU_LEGACY to point to the 1.8.1
   installation. $ICU_LEGACY is needed solely to access the 1.8.1
   include directory through the $LEGACY_INCLUDE variable, so if you
   want to move the 1.8.1 include directory, you can set
   $LEGACY_INCLUDE directly to that directory. Run make check. You
   should get two different libraries running at the same time.

Win32:
Build both the current ICU and ICU 1.8.1. Take icuuc18.dll, icuin18.dll
and icudt18l.dll and put them somewhere in PATH (a sane place would be
wherever the current DLLs go). Edit the include directory for
oldcol.cpp so that it points to the include directory of ICU 1.8.1.
Edit the two library entries with path so that they point to the .lib
files for your version of ICU. Hit F7, followed by ctrl-F5.

Troubleshooting (all platforms):

Sample won't compile: this is quite unlikely, but the most probable
reason is that include files cannot be found.

Sample won't link: The path for the 1.8.1 libraries is broken. Edit it
so that it reflects the path to your libraries.

Linker says: "Undefined symbol u_getVersion()" (or something similar):
the path to the 1.8.1 libraries is bad.

Linker says: "Undefined symbol u_getVersion()_X_Y" (or something
similar): the path to the current libraries is bad.

Legacy crashes horribly: Sorry, we didn't put in any error checking. If
legacy crashes, that's most probably because it cannot find the data
libraries. You can see which data library is not found by the part of
the program that is running. Make sure the program can find the data
library either by putting it wherever ICU_DATA points to OR by putting
the DLL version of the data library somewhere on your PATH.

========================================================================
*  icu/source/samples/msgfmt/README.TXT
========================================================================
Copyright (c) 2002-2010, International Business Machines Corporation and
others. All Rights Reserved.

IMPORTANT: This sample was originally intended as an exercise for the
ICU Workshop (September 2000). The code currently provided in the
solution file is the answer to the exercises; each step can still be
found in the 'answers' subdirectory.

  http://www.icu-project.org/docs/workshop_2000/agenda.html

  Day 2: September 12th 2000
  Pre-requisites:
  1. All the hardware and software requirements from Day 1.
  2. Attended or fully understand Day 1 material.
  3. Read through the ICU user's guide at
     http://www.icu-project.org/userguide/.

  #Date/Time/Number Formatting Support
  9:30am - 10:30am
  Alan Liu

  Topics:
  1. What is the date/time support in ICU?
  2. What is the timezone support in ICU?
  3. What kind of formatting and parsing support is available in ICU,
     i.e. NumberFormat, DateFormat, MessageFormat?

INSTRUCTIONS
------------

This exercise was first developed and tested on ICU release 1.6.0,
Win32, Microsoft Visual C++ 6.0.  It should work on other ICU releases
and other platforms as well.

MSVC:  Open the file "msgfmt.sln" in Microsoft Visual C++.

Unix:  - Build and install ICU with a prefix, for example
         '--prefix=/home/srl/ICU'
       - Set the variable ICU_PREFIX=/home/srl/ICU and use GNU make in
         this directory.
       - You may use 'make check' to invoke this sample.

PROBLEMS
--------

Problem 0:

  Set up the program, build it, and run it.  To start with, the
  program prints out the word "Message".

Problem 1: Basic Message Formatting (Easy)

  Use a MessageFormat to create a message that prints out "Received
  <n> argument(s) on <d>.", where n is the number of command line
  arguments (use argc-1), and d is the date (use Calendar::getNow()).

  HINT: Your message pattern should have a "number" element and a
  "date" element, and you will need to use Formattable.

Problem 2: ChoiceFormat (Medium)

  We can do better than "argument(s)".  Instead, we can display more
  idiomatic strings, such as "no arguments", "one argument", "two
  arguments", and for higher values, we can use a number format.

  This kind of value-based switching is done using a ChoiceFormat.
  However, you seldom need to create a ChoiceFormat by itself.
  Instead, most of the time you will supply the ChoiceFormat pattern
  within a MessageFormat pattern.

  Use a ChoiceFormat pattern within the MessageFormat pattern, instead
  of the "number" element, to display more idiomatic strings.

  EXTRA: Embed a number element within the choice element to handle
  values greater than two.

ANSWERS
-------

The exercise includes answers.  These are in the "answers" directory,
and are numbered 1, 2, etc.

If you get stuck and you want to move to the next step, copy the
answers file into the main directory in order to proceed.  E.g.,
"main_1.cpp" contains the original "main.cpp" file.  "main_2.cpp"
contains the "main.cpp" file after problem 1.  Etc.

Have fun!

========================================================================
*  icu/source/samples/numfmt/readme.txt
========================================================================
Copyright (c) 2002-2005, International Business Machines Corporation and
others. All Rights Reserved.

numfmt: a sample program which displays number formatting in C and C++

This sample demonstrates
         Formatting a number
         Outputting text in the default codepage to the console

Files:
    main.cpp        Main source file in C++
    capi.c          C version
    util.cpp        formatted output convenience implementation
    util.h          formatted output convenience header
    numfmt.sln      Windows MSVC workspace.  Double-click this to get started.
    numfmt.vcproj   Windows MSVC project file

To Build on Windows
    1.  Install and build ICU
    2.  In MSVC, open the workspace file icu\samples\numfmt\numfmt.sln
    3.  Choose a Debug or Release build.
    4.  Build.

To Run on Windows
    1.  Start a command shell window
    2.  Add ICU's bin directory to the path, e.g.
            set PATH=c:\icu\bin;%PATH%
        (Use the path to wherever ICU is on your system.)
    3.  cd into the numfmt directory, e.g.
            cd c:\icu\source\samples\numfmt\debug
    4.  Run it
            numfmt

To Build on Unixes
    1.  Build ICU.
        Specify an ICU install directory when running configure,
        using the --prefix option.  The steps to build ICU will look
        something like this:
            cd /source
            runConfigureICU --prefix [other options]
            gmake all

    2.  Install ICU,
            gmake install

    3.  Compile
            cd /source/samples/numfmt
            gmake ICU_PREFIX=

To Run on Unixes
    cd /source/samples/numfmt
    gmake ICU_PREFIX= check
        -or-
    export LD_LIBRARY_PATH=/lib:.:$LD_LIBRARY_PATH
    numfmt

Note:  The name of the LD_LIBRARY_PATH variable is different on some
       systems.  If in doubt, run the sample using "gmake check", and
       note the name of the variable that is used there.
       LD_LIBRARY_PATH is the correct name for Linux and Solaris.

========================================================================
*  icu/source/samples/props/readme.txt
========================================================================
Copyright (c) 2002-2005, International Business Machines Corporation and
others. All Rights Reserved.

props: Unicode Character Properties

This sample demonstrates
         Using ICU to determine the properties of Unicode characters

Files:
    props.cpp      Main source file in C++
    props.sln      Windows MSVC workspace.  Double-click this to get started.
    props.vcproj   Windows MSVC project file

To Build props on Windows
    1.  Install and build ICU
    2.  In MSVC, open the workspace file icu\samples\props\props.sln
    3.  Choose a Debug or Release build.
    4.  Build.

To Run on Windows
    1.  Start a command shell window
    2.  Add ICU's bin directory to the path, e.g.
            set PATH=c:\icu\bin;%PATH%
        (Use the path to wherever ICU is on your system.)
    3.  cd into the props directory, e.g.
            cd c:\icu\source\samples\props\debug
    4.  Run it
            props

To Build on Unixes
    1.  Build ICU.
        Specify an ICU install directory when running configure,
        using the --prefix option.  The steps to build ICU will look
        something like this:
            cd /source
            runConfigureICU --prefix [other options]
            gmake all

    2.  Install ICU,
            gmake install

    3.
        Compile
            cd /source/samples/props
            gmake ICU_PREFIX=

To Run on Unixes
    cd /source/samples/props
    gmake ICU_PREFIX= check
        -or-
    export LD_LIBRARY_PATH=/lib:.:$LD_LIBRARY_PATH
    props

Note:  The name of the LD_LIBRARY_PATH variable is different on some
       systems.  If in doubt, run the sample using "gmake check", and
       note the name of the variable that is used there.
       LD_LIBRARY_PATH is the correct name for Linux and Solaris.

========================================================================
*  icu/source/samples/readme.txt
========================================================================
## Copyright (c) 2002-2010, International Business Machines Corporation
## and others. All Rights Reserved.

This directory contains sample code.
Below is a short description of the contents of this directory.

break     - demonstrates how to use BreakIterators in C and C++.
cal       - prints out a calendar.
case      - demonstrates how to do Unicode case conversion in C and C++.
csdet     - demonstrates using ICU's CharSet Detection API
date      - prints out the current date, localized.
datefmt   - an exercise using the date formatting API
layout    - demonstrates the ICU LayoutEngine
legacy    - demonstrates using two versions of ICU in one application
msgfmt    - demonstrates the use of the Message Format
numfmt    - demonstrates the use of the number format
props     - demonstrates the use of Unicode properties
strsrch   - demonstrates how to search for patterns in Unicode text
            using the usearch interface.
translit  - demonstrates the use of ICU transliteration
uciter8.c - demonstrates how to leniently read 8-bit Unicode text.
ucnv      - demonstrates the use of ICU codepage conversion
udata     - demonstrates the use of ICU low level data routines
            (reader/writer in 'all' MSVC solution)
ufortune  - demonstrates packaging and use of resources in an application
ugrep     - demonstrates ICU Regular Expressions.
uresb     - demonstrates building and loading resource bundles
ustring   - demonstrates ICU string manipulation functions

==
* Where can I find more sample code?

 - The "uconv" utility is a full-featured command line application.
   It is normally built with ICU, and is located in
   icu/source/extra/uconv

 - The "icuapps" CVS module contains other applications and libraries
   not included with ICU. You can check it out from the CVS command
   line by using for example, "cvs co icuapps" instead of "cvs co icu",
   or through WebCVS at
   http://dev.icu-project.org/cgi-bin/viewcvs.cgi/icuapps/

==
* How do I build the samples?

 - See the Readme in each subdirectory

 To build all samples at once:

    Windows MSVC:
       - build ICU
       - open 'all' project file in 'all' subdirectory
       - build project
       - sample executables will be located in /x86/Debug folders of
         each sample subdirectory

    Unix:
       - build and install (make install) ICU
       - be sure 'icu-config' is accessible from the PATH
       - type 'make all-samples' from this directory
         (other targets: clean-samples, check-samples)
       Note: 'make all-samples' won't work correctly in out of source
       builds.

 - legacy and layout are not included in these lists, please see their
   individual readmes.

========================================================================
*  icu/source/samples/strsrch/readme.txt
========================================================================
Copyright (c) 2002-2005, International Business Machines Corporation and
others. All Rights Reserved.

strsrch: a sample program which finds the occurrences of a pattern
string in a source string, using user-defined collation rules.

This sample demonstrates
         Creating a user-defined string search mechanism.
         Finding all occurrences of a pattern string in a given source
         string.

Files:
    strsrch.c        Main source file
    strsrch.sln      Windows MSVC workspace.  Double-click this to get started.
    strsrch.vcproj   Windows MSVC project file

To Build strsrch on Windows
    1.  Install and build ICU
    2.
        In MSVC, open the workspace file icu\samples\strsrch\strsrch.sln
    3.  Choose a Debug or Release build.
    4.  Build.

To Run on Windows
    1.  Start a command shell window
    2.  Add ICU's bin directory to the path, e.g.
            set PATH=c:\icu\bin;%PATH%
        (Use the path to wherever ICU is on your system.)
    3.  cd into the strsrch directory, e.g.
            cd c:\icu\source\samples\strsrch\debug
    4.  Run it
            strsrch [options*] -source source_string -pattern pattern_string

To Build on Unixes
    1.  Build ICU.  strsrch is built automatically by default unless
        samples are turned off.
        Specify an ICU install directory when running configure,
        using the --prefix option.  The steps to build ICU will look
        something like this:
            cd /source
            runConfigureICU --prefix [other options]
            gmake all

    2.  Install ICU,
            gmake install

To Run on Unixes
    cd /source/samples/strsrch
    gmake check
        -or-
    export LD_LIBRARY_PATH=/lib:.:$LD_LIBRARY_PATH
    strsrch

Note:  The name of the LD_LIBRARY_PATH variable is different on some
       systems.  If in doubt, run the sample using "gmake check", and
       note the name of the variable that is used there.
       LD_LIBRARY_PATH is the correct name for Linux and Solaris.

========================================================================
*  icu/source/samples/translit/README.TXT
========================================================================
Copyright (c) 2002-2010, International Business Machines Corporation and
others. All Rights Reserved.

IMPORTANT: This sample was originally intended as an exercise for the
ICU Workshop (September 2000). The code currently provided in the
solution file is the answer to the exercises; each step can still be
found in the 'answers' subdirectory.

  http://www.icu-project.org/docs/workshop_2000/agenda.html

  Day 2: September 12th 2000
  Pre-requisites:
  1. All the hardware and software requirements from Day 1.
  2. Attended or fully understand Day 1 material.
  3. Read through the ICU user's guide at
     http://www.icu-project.org/userguide/.
#Transformation Support 10:45am - 12:00pm Alan Liu Topics: 1. What is the Unicode normalization? 2. What kind of case mapping support is available in ICU? 3. What is Transliteration and how do I use a Transliterator on a document? 4. How do I add my own Transliterator? INSTRUCTIONS ------------ This exercise was developed and tested on ICU release 1.6.0, Win32, Microsoft Visual C++ 6.0. It should work on other ICU releases and other platforms as well. MSVC: Open the file "translit.sln" in Microsoft Visual C++. Unix: - Build and install ICU with a prefix, for example '--prefix=/home/srl/ICU' - Set the variable ICU_PREFIX=/home/srl/ICU and use GNU make in this directory. - You may use 'make check' to invoke this sample. PROBLEMS -------- Problem 0: To start with, the program prints out a series of dates formatted in Greek. Set up the program, build it, and run it. Problem 1: Basic Transliterator (Easy) The Greek text shows up almost entirely as Unicode escapes. These are unreadable on a US machine. Use an existing system transliterator to transliterate the Greek text to Latin so it can be phonetically read on a US machine. If you don't know the names of the system transliterators, use Transliterator::getAvailableID() and Transliterator::countAvailableIDs(), or look directly in the index table icu/data/translit_index.txt. Problem 2: RuleBasedTransliterator (Medium) Some of the text is still unreadable and shows up as Unicode escape sequences. Create a RuleBasedTransliterator to change the unreadable characters to close ASCII equivalents. For example, the rule "\u00C0 > A;" will change an 'A' with a grave accent to a plain 'A'. To save typing, use UnicodeSets to handle ranges of characters. See the included file "U0080.pdf" for a table of the U+00C0 to U+00FF Unicode block. Problem 3: Transliterator subclassing; Normalizer (Difficult) The rule-based approach is flexible and, in most cases, the best choice for creating a new transliterator. 
Sometimes, however, a more elegant algorithmic solution is
available. Instead of typing in a list of rules, you can write C++
code to accomplish the desired transliteration. Use a Normalizer
to remove accents from characters. You will need to convert each
character to a sequence of base and combining characters by
applying a canonical decomposition (NFD). Then discard the
combining characters (the accents etc.), leaving the base
character. Wrap this all up in a subclass of the Transliterator
class that overrides the pure virtual handleTransliterate()
method.

ANSWERS
-------
The exercise includes answers. These are in the "answers" directory,
and are numbered 1, 2, etc. In some cases new files that the user
needs to create are included in the answers directory.

If you get stuck and you want to move to the next step, copy the
answers file into the main directory in order to proceed. E.g.,
"main_1.cpp" contains the original "main.cpp" file. "main_2.cpp"
contains the "main.cpp" file after problem 1. Etc.

Have fun!

========================================================================
* icu/source/samples/uciter8/readme.txt
========================================================================
Copyright (c) 2003-2005, International Business Machines Corporation
and others. All Rights Reserved.

uciter8: Lenient reading of 8-bit Unicode with a UCharIterator

This sample demonstrates reading 8-bit Unicode text leniently,
accepting a mix of UTF-8 and CESU-8 and also accepting single
surrogates.
UTF-8-style macros are defined as well as a UCharIterator.
The macros are incomplete (do not assemble code points from pairs of
surrogates) but sufficient for the iterator.

If you wish to use the lenient-UTF/CESU-8 UCharIterator in a context
outside of this sample, then copy the uit_len8.c file, as well as
either the uit_len8.h header or just the prototype that it contains.

*** Warning: ***
This UCharIterator reads an arbitrary mix of UTF-8 and CESU-8 text.
It does not conform to any one Unicode charset specification,
and its use may lead to security risks.

Files:
    uciter8.c       Main source file in C
    uit_len8.c      Lenient-UTF/CESU-8 UCharIterator implementation
    uit_len8.h      Header file with the prototype for the
                    lenient-UTF/CESU-8 UCharIterator
    uciter8.sln     Windows MSVC workspace. Double-click this to get started.
    uciter8.vcproj  Windows MSVC project file

To Build uciter8 on Windows
    1. Install and build ICU
    2. In MSVC, open the workspace file icu\samples\uciter8\uciter8.sln
    3. Choose a Debug or Release build.
    4. Build.

To Run on Windows
    1. Start a command shell window
    2. Add ICU's bin directory to the path, e.g.
           set PATH=c:\icu\bin;%PATH%
       (Use the path to wherever ICU is on your system.)
    3. cd into the uciter8 directory, e.g.
           cd c:\icu\source\samples\uciter8\debug
    4. Run it
           uciter8

To Build on Unixes
    1. Build ICU.
       Specify an ICU install directory when running configure,
       using the --prefix option. The steps to build ICU will look
       something like this:
          cd /source
          runConfigureICU --prefix [other options]
          gmake all

    2. Install ICU,
          gmake install

    3. Compile
          cd /source/samples/uciter8
          gmake ICU_PREFIX=/source/samples/uciter8
          gmake ICU_PREFIX= check
              -or-
          export LD_LIBRARY_PATH=/lib:.:$LD_LIBRARY_PATH
          uciter8

Note: The name of the LD_LIBRARY_PATH variable is different on some
systems. If in doubt, run the sample using "gmake check", and note
the name of the variable that is used there. LD_LIBRARY_PATH is the
correct name for Linux and Solaris.

========================================================================
* icu/source/samples/ucnv/readme.txt
========================================================================
Copyright (C) 2002-2010, International Business Machines Corporation
and others. All Rights Reserved.
convsamp: a sample program which demonstrates using ICU conversion

This sample demonstrates
    Opening and closing converters using the C API
    String manipulation in C
    Writing a custom conversion callback function

Files:
    convsamp.c   Main source file
    flagcb.h     codepage output convenience header
    flagcb.c     codepage output convenience implementation
    ucnv.sln     Windows MSVC workspace. Double-click this to get started.
    ucnv.vcproj  Windows MSVC project file

To Build ucnv on Windows
    1. Install and build ICU
    2. In MSVC, open the workspace file icu\samples\ucnv\ucnv.sln
    3. Choose a Debug or Release build.
    4. Build.

To Run on Windows
    1. Start a command shell window
    2. Add ICU's bin directory to the path, e.g.
           set PATH=c:\icu\bin;%PATH%
       (Use the path to wherever ICU is on your system.)
    3. cd into the ucnv directory, e.g.
           cd c:\icu\source\samples\ucnv\debug
    4. Run it
           ucnv

WARNING: The .bin and .txt files must be in the same directory as the
executable, which is not the case by default on some systems.

To Build on Unixes
    1. Build ICU.
       Specify an ICU install directory when running configure,
       using the --prefix option. The steps to build ICU will look
       something like this:
          cd /source
          runConfigureICU --prefix [other options]
          gmake all

    2. Install ICU,
          gmake install

    3. Build
          set the variable ICU_PREFIX=
          gmake all

To Run on Unixes
    cd /source/samples/ucnv

    gmake check
        -or-
    export LD_LIBRARY_PATH=/lib:.:$LD_LIBRARY_PATH
    convsamp

Note: The name of the LD_LIBRARY_PATH variable is different on some
systems. If in doubt, run the sample using "gmake check", and note
the name of the variable that is used there. LD_LIBRARY_PATH is the
correct name for Linux and Solaris.

========================================================================
* icu/source/samples/udata/readme.txt
========================================================================
Copyright (c) 2002-2010, International Business Machines Corporation
and others. All Rights Reserved.
udata: Low level ICU data

This sample demonstrates
    Using the low level ICU data handling interfaces (udata) to create
    and later access user data.

Files:
    writer.c      C source for Writer application, will generate data
                  file to be read by Reader.
    reader.c      C source for Reader application, will read file
                  created by Writer
    udata.sln     Windows MSVC workspace. Double-click this to get started.
    udata.vcproj  Windows MSVC project file

To Build udata on Windows
    1. Install and build ICU
    2. In MSVC, open the workspace file icu\samples\udata\udata.sln
    3. Choose a Debug or Release build.
    4. Build.

To Run on Windows
    1. Start a command shell window
    2. Add ICU's bin directory to the path, e.g.
           set PATH=c:\icu\bin;%PATH%
       (Use the path to wherever ICU is on your system.)
    3. cd into the udata directory, e.g.
           cd c:\icu\source\samples\udata\debug
    4. Run it
           writer
           reader

IMPORTANT: On some systems, the reader and writer executables may not
be in the same directory. If so, reader will likely look for the .dat
file in the wrong directory.

To Build on Unixes
    1. Build ICU.
       Specify an ICU install directory when running configure,
       using the --prefix option. The steps to build ICU will look
       something like this:
          cd /source
          runConfigureICU --prefix [other options]
          gmake all

    2. Install ICU,
          gmake install

    3. Compile
          You will need to set ICU_PATH to the location of your ICU
          source tree, for example ICU_PATH=/home/srl/icu (containing
          source, etc.)
          cd /source/samples/udata
          gmake ICU_PATH= ICU_PREFIX=/source/samples/udata
          gmake ICU_PREFIX= check
              -or-
          export LD_LIBRARY_PATH=/lib:.:$LD_LIBRARY_PATH
          writer
          reader

Note: The name of the LD_LIBRARY_PATH variable is different on some
systems. If in doubt, run the sample using "gmake check", and note
the name of the variable that is used there. LD_LIBRARY_PATH is the
correct name for Linux and Solaris.
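The note about the library path variable, which recurs in each sample
readme, can be sketched as a small shell helper. The readmes only name
LD_LIBRARY_PATH (Linux and Solaris); the other names below
(DYLD_LIBRARY_PATH, LIBPATH, SHLIB_PATH) are assumptions based on
common platform conventions, and the function names are hypothetical
and not part of ICU. "gmake check" remains the reliable way to see
which name the sample Makefile actually uses.

```shell
#!/bin/sh
# Hypothetical helper (not part of ICU): print the name of the environment
# variable that the dynamic loader usually consults on a given system.
# The argument is a kernel name as printed by `uname -s`.
icu_lib_path_var() {
    case "$1" in
        Darwin) echo DYLD_LIBRARY_PATH ;;   # assumption: macOS convention
        AIX)    echo LIBPATH ;;             # assumption: AIX convention
        HP-UX)  echo SHLIB_PATH ;;          # assumption: HP-UX convention
        *)      echo LD_LIBRARY_PATH ;;     # Linux and Solaris, per the note
    esac
}

# Prepend a library directory (and ".") to the variable chosen for this
# system, mirroring the "export LD_LIBRARY_PATH=...:.:$LD_LIBRARY_PATH"
# step. $1 is the directory that holds the installed ICU libraries.
icu_prepend_lib_dir() {
    var=$(icu_lib_path_var "$(uname -s)")
    eval "old=\${$var:-}"
    eval "$var=\"\$1:.\${old:+:\$old}\""
    export "$var"
}
```

After sourcing the block, a call such as
icu_prepend_lib_dir /home/srl/ICU/lib (the prefix here is only an
example) sets the variable for the current shell before running
writer and reader.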
======================================================================== * icu/source/samples/ufortune/readme.txt ======================================================================== Copyright (c) 2002-2005, International Business Machines Corporation and others. All Rights Reserved. ufortune: a sample program demonstrating the use of ICU resource files by an application. This sample demonstrates Defining resources for use by an application Compiling and packaging them into a dll Referencing the resource-containing dll from application code Loading resource data using ICU's API Files: ./ufortune.c source code for the sample ./ufortune.sln Windows MSVC workspace. Double-click this to get started. ./ufortune.vcproj Windows MSVC project file. ./Makefile Makefile for Unixes. Needs gmake. resources/root.txt Default resources (text for messages in English) resources/es.txt Spanish language resources source file.. resources/res-file-list.txt List of resource source files to be built resources/Makefile Makefile for compiling resources, for Unixes. To Build ufortune on Windows 1. Install and build ICU 2. In MSVC, open the workspace file icu\samples\ufortune\ufortune.sln 3. Choose a Debug or Release build. 4. Build. To Run on Windows 1. Start a command shell window 2. Add ICU's bin directory to the path, e.g. set PATH=c:\icu\bin;%PATH% (Use the path to where ever ICU is on your system.) 3. cd into the ufortune directory, e.g. cd c:\icu\source\samples\ufortune\debug 4. Run it ufortune To Build on Unixes 1. Build ICU. Specify an ICU install directory when running configure, using the --prefix option. The steps to build ICU will look something like this: cd /source runConfigureICU --prefix [other options] gmake all 2. Install ICU, gmake install 3. 
Build the sample cd /source/samples/ufortune export ICU_PREFIX= gmake To Run on Unixes cd /source/samples/ufortune gmake check or export LD_LIBRARY_PATH=/lib:.:$LD_LIBRARY_PATH ufortune Note: The name of the LD_LIBRARY_PATH variable is different on some systems. If in doubt, run the sample using "gmake check", and note the name of the variable that is used there. LD_LIBRARY_PATH is the correct name for Linux and Solaris. ======================================================================== * icu/source/samples/ugrep/readme.txt ======================================================================== Copyright (c) 2002-2005, International Business Machines Corporation and others. All Rights Reserved. ugrep: a sample program demonstrating the use of ICU regular expression API. usage: ugrep [options] pattern [file ...] --help Output a brief help message -n, --line-number Prefix each line of output with the line number within its input file. -V, --version Output the program version number The program searches for the specified regular expression in each of the specified files, and outputs each matching line. Input files are in the system default (locale dependent) encoding, unless they begin with a BOM, in which case they are assumed to be in the UTF encoding specified by the BOM. Program output is always in the system's default 8 bit code page. Files: ./ugrep.c source code for the sample ./ugrep.sln Windows MSVC workspace. Double-click this to get started. ./ugrep.vcproj Windows MSVC project file. ./Makefile Makefile for Unixes. Needs gmake. To Build ugrep on Windows 1. Install and build ICU 2. In MSVC, open the workspace file icu\samples\ugrep\ugrep.sln 3. Choose a Debug or Release build. 4. Build. To Run on Windows 1. Start a command shell window 2. Add ICU's bin directory to the path, e.g. set PATH=c:\icu\bin;%PATH% (Use the path to where ever ICU is on your system.) 3. cd into the ugrep directory, e.g. cd c:\icu\source\samples\ugrep\debug 4. Run it ugrep ... 
To Build on Unixes 1. Build ICU. Specify an ICU install directory when running configure, using the --prefix option. The steps to build ICU will look something like this: cd /source runConfigureICU --prefix [other options] gmake all 2. Install ICU, gmake install 3. Build the sample Put the install directory containing icu-config on the $PATH. This will generally be /bin cd /source/samples/ugrep gmake To Run on Unixes cd /source/samples/ugrep export LD_LIBRARY_PATH=/lib:.:$LD_LIBRARY_PATH ugrep ... Note: The name of the LD_LIBRARY_PATH variable is different on some systems. If in doubt, run the sample using "gmake check", and note the name of the variable that is used there. LD_LIBRARY_PATH is the correct name for Linux and Solaris. ======================================================================== * icu/source/samples/uresb/readme.txt ======================================================================== Copyright (c) 2001-2010 International Business Machines Corporation and others. All Rights Reserved. uresb: Resource Bundle This sample demonstrates Building a resource bundle Using ICU to print data from a resource bundle Files: uresb.c Main source file in C uresb.sln Windows MSVC workspace. Double-click this to get started. uresb.vcproj Windows MSVC project file resources.dsp Windows project file for resources resources.mak Windows makefile for resources root.txt Root resource bundle en.txt English translation sr.txt Serbian translation (cp1251) To Build uresb on Windows 1. Install and build ICU 2. In MSVC, open the workspace file icu\samples\uresb\uresb.sln 3. Choose a Debug or Release build. 4. Build. To Run on Windows 1. Start a command shell window 2. Add ICU's bin directory to the path, e.g. set PATH=c:\icu\bin;%PATH% (Use the path to where ever ICU is on your system.) 3. cd into the uresb directory, e.g. cd c:\icu\source\samples\uresb\debug 4. Run it (with a locale name, ex. 
english) uresb en WARNING: The .txt files must be in the same directory as the executable, which is not the case by default on some systems. To Build on Unixes 1. Build ICU. Specify an ICU install directory when running configure, using the --prefix option. The steps to build ICU will look something like this: cd /source runConfigureICU --prefix [other options] gmake all 2. Install ICU, gmake install 3. Compile cd /source/samples/uresb gmake ICU_PREFIX= To Run on Unixes cd /source/samples/uresb gmake ICU_PREFIX= check -or- export LD_LIBRARY_PATH=/lib:.:$LD_LIBRARY_PATH uresb Note: The name of the LD_LIBRARY_PATH variable is different on some systems. If in doubt, run the sample using "gmake check", and note the name of the variable that is used there. LD_LIBRARY_PATH is the correct name for Linux and Solaris. ======================================================================== * icu/source/samples/ustring/readme.txt ======================================================================== Copyright (c) 2002-2005, International Business Machines Corporation and others. All Rights Reserved. ustring: Unicode String Manipulation This sample demonstrates Using ICU to manipulate UnicodeString objects Files: ustring.cpp Main source file in C++ ustring.sln Windows MSVC workspace. Double-click this to get started. ustring.vcproj Windows MSVC project file To Build ustring on Windows 1. Install and build ICU 2. In MSVC, open the workspace file icu\samples\ustring\ustring.sln 3. Choose a Debug or Release build. 4. Build. To Run on Windows 1. Start a command shell window 2. Add ICU's bin directory to the path, e.g. set PATH=c:\icu\bin;%PATH% (Use the path to where ever ICU is on your system.) 3. cd into the ustring directory, e.g. cd c:\icu\source\samples\ustring\debug 4. Run it ustring To Build on Unixes 1. Build ICU. Specify an ICU install directory when running configure, using the --prefix option. 
The steps to build ICU will look something like this:
          cd /source
          runConfigureICU --prefix [other options]
          gmake all

    2. Install ICU,
          gmake install

    3. Compile
          cd /source/samples/ustring
          gmake ICU_PREFIX=/source/samples/ustring
          gmake ICU_PREFIX= check
              -or-
          export LD_LIBRARY_PATH=/lib:.:$LD_LIBRARY_PATH
          ustring

Note: The name of the LD_LIBRARY_PATH variable is different on some
systems. If in doubt, run the sample using "gmake check", and note
the name of the variable that is used there. LD_LIBRARY_PATH is the
correct name for Linux and Solaris.

========================================================================
* icu/source/tools/genren/README
========================================================================
Copyright (c) 2002-2011, International Business Machines Corporation
and others. All Rights Reserved.

The genren.pl script is used to generate the
source/common/unicode/urename.h header file, which is needed for
renaming the ICU exported names.

This script is intended to be used on Linux, although it should work
on any platform that has Perl and the nm command. The Makefile may
need to be updated; it is not 100% portable. It also does not
currently work well in an out-of-source build.

The following instructions are for the Linux version.

- the urename.h file should be generated after implementation is
  complete for a release.
- the version number for a release should be set according to the
  list in source/common/unicode/uvernum.h
- In this [genren] directory, run "make install-header"
- urename.h will be updated in icu/source/common/unicode/urename.h
  **in your original source directory**
- Eyeball the new file for errors
- Other make targets here
    clean      - cleans out intermediate files
    urename.h  - just builds ./urename.h

========================================================================
* icu/source/tools/tzcode/readme.txt
========================================================================
**********************************************************************
* Copyright (c) 2003-2007, International Business Machines
* Corporation and others. All Rights Reserved.
**********************************************************************
* Author: Alan Liu
* Created: August 18 2003
* Since: ICU 2.8
**********************************************************************

Note: this directory currently contains tzcode as of
tzcode2006h.tar.gz with localtime.c patches from tzcode2006i.tar.gz

----------------------------------------------------------------------
OVERVIEW

This file describes the tools in icu/source/tools/tzcode

The purpose of these tools is to process the zoneinfo or "Olson" time
zone database into a form usable by ICU4C (release 2.8 and later).
Unlike earlier releases, ICU4C 2.8 supports historical time zone
behavior, as well as the full set of Olson compatibility IDs.

References:

ICU4C: http://www.icu-project.org/
Olson: ftp://elsie.nci.nih.gov/pub/

----------------------------------------------------------------------
ICU4C vs. ICU4J

For ICU releases >= 2.8, both ICU4C and ICU4J implement full
historical time zones, based on Olson data. The implementations in C
and Java are somewhat different. The C implementation is a
self-contained implementation, whereas ICU4J uses the underlying JDK
1.3 or 1.4 time zone implementation.
Older versions of ICU (C and Java <= 2.6) implement a "present day
snapshot". This only reflects current time zone behavior, without
historical variation. Furthermore, it lacks the full set of Olson
compatibility IDs.

----------------------------------------------------------------------
BACKGROUND

The zoneinfo or "Olson" time zone package is used by various systems
to describe the behavior of time zones. The package consists of
several parts. E.g.:

  Index of ftp://elsie.nci.nih.gov/pub/

  classictzcode.tar.gz     65 KB    12/10/1994   12:00:00 AM
  classictzdata.tar.gz     67 KB    12/10/1994   12:00:00 AM
  e5+57.tar.gz           2909 KB     3/22/1993   12:00:00 AM
  iso8601.ps.gz            16 KB     7/27/1996   12:00:00 AM
  leastsq.xls              49 KB     4/24/1997   12:00:00 AM
  ltroff.tar.gz            36 KB     7/16/1993   12:00:00 AM
  pi.shar.gz                4 KB      3/9/1994   12:00:00 AM
  tzarchive.gz           3412 KB     8/18/2003    4:00:00 AM
  tzcode2003a.tar.gz       98 KB     3/24/2003    2:32:00 PM
  tzdata2003a.tar.gz      132 KB     3/24/2003    2:32:00 PM

ICU only uses the tzdataYYYYV.tar.gz files, where YYYY is the year
and V is the version letter ('a'...'z').

This directory has partial contents of tzcode checked into ICU

----------------------------------------------------------------------
HOWTO

0. Note, these instructions will only work on POSIX type systems.

1. Obtain the current versions of tzdataYYYYV.tar.gz (aka `tzdata')
   from the FTP site given above. Either manually download or use
   wget:

   $ cd {path_to}/icu/source/tools/tzcode
   $ wget "ftp://elsie.nci.nih.gov/pub/tzdata*.tar.gz"

2. Copy only one tzdata*.tar.gz file into the
   icu/source/tools/tzcode/ directory (this directory).

   *** Make sure you only have ONE FILE named tzdata*.tar.gz in the
       directory.

3. Build ICU normally. You will see a notice "updating
   zoneinfo.txt..."

### Following instructions for ICU maintainers only ###

4. Obtain the current version of tzcodeYYYY.tar.gz from the FTP site
   to this directory.

5. Run make target "check-dump".
   This target extracts and builds the original tzcode, then compiles
   the original tzdata together with the ICU supplemental data
   (icuzones). It then builds zdump and icuzdump and dumps all time
   transitions for every ICU time zone to files under the zdumpout
   and icuzdumpout directories. If the two tools produce different
   results, the target returns an error.

6. Don't forget to check in the new zoneinfo.txt (from its location at
   {path_to}/icu/source/data/misc/zoneinfo.txt) into SVN.
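
The requirement in step 2 (exactly ONE tzdata*.tar.gz in this
directory) can be checked before building. The following is a minimal
sketch only; the function names count_tzdata_archives and
check_single_tzdata are hypothetical and are not part of the ICU
Makefiles.

```shell
#!/bin/sh
# Hypothetical pre-build check (not part of the ICU build): count the
# tzdata*.tar.gz archives in a directory (default: current directory).
count_tzdata_archives() {
    dir=${1:-.}
    n=0
    for f in "$dir"/tzdata*.tar.gz; do
        # An unmatched glob stays a literal pattern, so test for existence.
        if [ -e "$f" ]; then
            n=$((n + 1))
        fi
    done
    echo "$n"
}

# Succeeds only when the directory holds exactly one tzdata archive,
# as step 2 of the HOWTO requires; otherwise reports the count found.
check_single_tzdata() {
    n=$(count_tzdata_archives "${1:-.}")
    if [ "$n" -ne 1 ]; then
        echo "expected exactly one tzdata*.tar.gz, found $n" >&2
        return 1
    fi
}
```

For example, running check_single_tzdata {path_to}/icu/source/tools/tzcode
before step 3 reports a problem if an older archive was left next to a
newly downloaded one.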