Changeset 915 in tests
- Timestamp: 07/19/2012 02:41:52 PM
- Location: trunk
- Files: 7 added, 2 deleted, 1 edited, 11 copied
trunk/data/formatting/utf-8/README
--- r909
+++ r915
@@ -2,24 +2,14 @@
 support is much, much, much, much better than PHP's.
 
-* `generate_remove_accents_tests.py` generates all of the
-remove_accents_from_* files.
-
-Call it with `python generate_remove_accents_tests.py`
-
-* `urlencode.py`, `u-urlencode.py` and `entitize.py` process UTF-8
+* `utf-8/urlencode.py`, `utf-8/u-urlencode.py` and `utf-8/entitize.py` process UTF-8
 into a few different formats (%-encoding, %u-encoding, &#decimal;)
 and are used like normal UNIXy pipes.
 
 Try:
-
-`python urlencode.py < utf-8.txt > utf-8-urlencoded.txt`
-`python u-urlencode.py < utf-8.txt > utf-8-u-urlencoded.txt`
-`python entitize.py < utf-8.txt > utf-8-entitized.txt`
-
-* I think `windows-1252.py` converts Windows-only smart-quotes
-and things into their unicode &#decimal reference; equivalents.
-
+
+`python urlencode.py < utf-8.txt > urlencoded.txt`
+`python u-urlencode.py < utf-8.txt > u-urlencoded.txt`
+`python entitize.py < utf-8.txt > entitized.txt`
 
-
-
-
+* `windows-1252.py` converts Windows-only smart-quotes and things
+into their unicode &#decimal reference; equivalents.
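The README text above names three output formats (%-encoding, %u-encoding, &#decimal;) without showing what they look like. The sketch below illustrates them on a short string. It is not the code of `urlencode.py`, `u-urlencode.py` or `entitize.py`, whose bodies this changeset does not show; the function names are hypothetical, it is written for Python 3 (the scripts themselves use Python 2 `print` statements), and it assumes that ASCII passes through unchanged, that hex digits are uppercase, and that %u-encoding emits one escape per UTF-16 code unit.

```python
# Illustrative sketch only (Python 3). Function names are hypothetical and
# the exact output conventions of the real scripts are assumptions.
from urllib.parse import quote


def percent_encode(text):
    """%-encode the UTF-8 bytes of each non-ASCII character."""
    return "".join(ch if ord(ch) < 128 else quote(ch, safe="") for ch in text)


def u_encode(text):
    """Non-standard %uXXXX form: one escape per UTF-16 code unit."""
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)
            continue
        units = ch.encode("utf-16-be")  # two bytes per code unit
        for i in range(0, len(units), 2):
            out.append("%%u%02X%02X" % (units[i], units[i + 1]))
    return "".join(out)


def entitize(text):
    """Replace each non-ASCII character with an &#decimal; reference."""
    return "".join(ch if ord(ch) < 128 else "&#%d;" % ord(ch) for ch in text)


if __name__ == "__main__":
    sample = "caf\u00e9"
    print(percent_encode(sample))  # caf%C3%A9
    print(u_encode(sample))        # caf%u00E9
    print(entitize(sample))        # caf&#233;
```

The real scripts are run as pipes over `utf-8.txt`, as the README shows; this sketch only exposes the per-string transformations so the three formats can be compared side by side.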
trunk/data/formatting/utf-8/entitize.py
--- r909
+++ r915
@@ -1,2 +1,5 @@
+# Generates entitized.txt from utf-8.txt.
+# Used by Test_Convert_UrlEncoded_To_Entities.
+
 import codecs
 import sys
…
@@ -11,5 +14,5 @@
 args = sys.argv[1:]
 if args and args[0] in ("-h", "--help"):
-    print "Usage: python entitize.py < utf 8-lines.txt > entitized-lines.txt"
+    print "Usage: python entitize.py < utf-8.txt > entitized.txt"
     sys.exit(2)
 
trunk/data/formatting/utf-8/u-urlencode.py
--- r909
+++ r915
@@ -1,2 +1,5 @@
+# Generates u-urlencoded.txt from utf-8.txt.
+# Used for Test_Convert_UrlEncoded_To_Entities.
+
 import codecs
 import sys
…
@@ -11,5 +14,5 @@
 args = sys.argv[1:]
 if args and args[0] in ("-h", "--help"):
-    print "Usage: python u-urlencode.py < utf 8-lines.txt > u-urlencoded-lines.txt"
+    print "Usage: python u-urlencode.py < utf-8.txt > u-urlencoded.txt"
     sys.exit(2)
 
trunk/data/formatting/utf-8/urlencode.py
--- r909
+++ r915
@@ -1,4 +1,4 @@
-# Generates test data for the utf8_uri_encode function in formatting.php
-# Pipe UTF-8 data to stdin and accept it from stdout
+# Generates urlencoded.txt from utf-8.txt.
+# Used for Test_UTF8_URI_Encode.
 
 import urllib, codecs, re
…
@@ -23,5 +23,5 @@
 args = sys.argv[1:]
 if args and args[0] in ("-h", "--help"):
-    print "Usage: python urlencode.py < utf 8-lines.txt > utf8-urlencoded-lines.txt"
+    print "Usage: python urlencode.py < utf-8.txt > urlencoded.txt"
     sys.exit(2)
 
trunk/tests/formatting/RemoveAccents.php
--- r909
+++ r915
@@ -37,5 +37,5 @@
 public function test_remove_accents_iso8859() {
 	// File is Latin1 encoded
-	$file = DIR_TESTDATA . DIRECTORY_SEPARATOR . 'formatting' . DIRECTORY_SEPARATOR . 'remove_accents.01.input.txt';
+	$file = DIR_TESTDATA . '/formatting/remove_accents.01.input.txt';
 	$input = file_get_contents( $file );
 	$input = trim( $input );