 PHP: Test and Code Coverage Analysis

LTP GCOV extension - code coverage report
Current view: directory - mbstring/libmbfl/mbfl - mbfilter.c
Test: PHP Code Coverage
Date: 2009-11-19
Instrumented lines: 1517
Executed lines: 930
Code covered: 61.3 %

       1                 : /*
       2                 :  * charset=UTF-8
       3                 :  * vim600: encoding=utf-8
       4                 :  */
       5                 : 
       6                 : /*
       7                 :  * "streamable kanji code filter and converter"
       8                 :  *
       9                 :  * Copyright (c) 1998,1999,2000,2001 HappySize, Inc. All rights reserved.
      10                 :  *
      11                 :  * This software is released under the GNU Lesser General Public License.
      12                 :  * (Version 2.1, February 1999)
      13                 :  * Please read the following details of the licence.
      14                 :  *
      15                 :  * ◆ Terms of Use ◆
      16                 :  *
      17                 :  * This software was developed by HappySize, Inc. Under copyright law and the
      18                 :  * Universal Copyright Convention, HappySize, Inc. holds the right to reserve
      19                 :  * all rights in this software and hereby exercises it. HappySize, Inc. grants
      20                 :  * you a non-exclusive right to use this software subject to the conditions
      21                 :  * set out below. No one may use this software in violation of the following
      22                 :  * conditions.
      23                 :  *
      24                 :  * Everyone is permitted to use this software under the terms of the "GNU
      25                 :  * Lesser General Public License (Version 2.1, February 1999)". Any use that
      26                 :  * does not satisfy the "GNU Lesser General Public License" requires written
      27                 :  * permission from HappySize, Inc.
      28                 :  *
      29                 :  * The full text of the "GNU Lesser General Public License" is available from
      30                 :  * the web pages below. The "GNU Lesser General Public License" was formerly
      31                 :  * called the Library General Public License.
      32                 :  *     http://www.gnu.org/ --- GNU website
      33                 :  *     http://www.gnu.org/copyleft/lesser.html --- license text
      34                 :  * Use is not licensed to anyone who cannot understand or abide by this license.
      35                 :  *
      36                 :  * However, this does not imply or assert any particular relationship between
      37                 :  * our company and the GNU Project.
      38                 :  *
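      39                 :  * ◆ Warranty ◆
      40                 :  *
      41                 :  * This software is designed and developed with the aim of providing the
      42                 :  * expected behavior, functionality, and performance, but this is not
      43                 :  * guaranteed. This software is provided "as is"; any warranty, express or
      44                 :  * implied, such as of its usefulness or fitness for a particular purpose, is
      45                 :  * void. We will not compensate for or make good any bodily injury, property
      46                 :  * damage, loss of data, or any other damage suffered directly or indirectly
      47                 :  * as a result of using or not using this software, even if the user, our
      48                 :  * company, or a third party had warned of the possibility of such damage.
      49                 :  * This provision takes precedence over all other warranties, contracts, and
      50                 :  * provisions, whether written or unwritten.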
      51                 :  *
      52                 :  * ◆ Copyright holder contact and inquiries about the terms of use ◆
      53                 :  *
      54                 :  * 〒102-0073
      55                 :  * Nihon Jisho Daiichi Bldg. 4F, 1-13-5 Kudankita, Chiyoda-ku, Tokyo
      56                 :  * HappySize, Inc.
      57                 :  * Phone: 03-3512-3655, Fax: 03-3512-3656
      58                 :  * Email: sales@happysize.co.jp
      59                 :  * Web: http://happysize.com/
      60                 :  *
      61                 :  * ◆ Author ◆
      62                 :  *
      63                 :  * Shigeru Kanemoto (金本 茂) <sgk@happysize.co.jp>
      64                 :  *
      65                 :  * ◆ History ◆
      66                 :  *
      67                 :  * 1998/11/10 sgk implementation in C++
      68                 :  * 1999/4/25  sgk Rewritten in C.
      69                 :  * 1999/4/26  sgk Implemented input filter. Added filters while guessing the kanji code.
      70                 :  * 1999/6/??      Unicode support.
      71                 :  * 1999/6/22  sgk Changed license to LGPL.
      72                 :  *
      73                 :  */
      74                 : 
      75                 : /* 
      76                 :  * Unicode support
      77                 :  *
      78                 :  * Portions copyright (c) 1999,2000,2001 by the PHP3 internationalization team.
      79                 :  * All rights reserved.
      80                 :  *
      81                 :  */
      82                 : 
      83                 : 
      84                 : #ifdef HAVE_CONFIG_H
      85                 : #include "config.h"
      86                 : #endif
      87                 : 
      88                 : #include <stddef.h>
      89                 : 
      90                 : #ifdef HAVE_STRING_H
      91                 : #include <string.h>
      92                 : #endif
      93                 : 
      94                 : #ifdef HAVE_STRINGS_H
      95                 : #include <strings.h>
      96                 : #endif
      97                 : 
      98                 : #ifdef HAVE_STDDEF_H
      99                 : #include <stddef.h>
     100                 : #endif
     101                 : 
     102                 : #include "mbfilter.h"
     103                 : #include "mbfl_filter_output.h"
     104                 : #include "mbfilter_pass.h"
     105                 : 
     106                 : #include "eaw_table.h"
     107                 : 
     108                 : /* hex character table "0123456789ABCDEF" */
     109                 : static char mbfl_hexchar_table[] = {
     110                 :         0x30,0x31,0x32,0x33,0x34,0x35,0x36,0x37,0x38,0x39,0x41,0x42,0x43,0x44,0x45,0x46
     111                 : };
     112                 : 
     113                 : 
     114                 : 
     115                 : /*
     116                 :  * encoding filter
     117                 :  */
     118                 : #define CK(statement)   do { if ((statement) < 0) return (-1); } while (0)
     119                 : 
     120                 : 
     121                 : /*
     122                 :  *  buffering converter
     123                 :  */
     124                 : mbfl_buffer_converter *
     125                 : mbfl_buffer_converter_new(
     126                 :     enum mbfl_no_encoding from,
     127                 :     enum mbfl_no_encoding to,
     128                 :     int buf_initsz)
     129            3174 : {
     130                 :         mbfl_buffer_converter *convd;
     131                 : 
     132                 :         /* allocate */
     133            3174 :         convd = (mbfl_buffer_converter*)mbfl_malloc(sizeof (mbfl_buffer_converter));
     134            3174 :         if (convd == NULL) {
     135               0 :                 return NULL;
     136                 :         }
     137                 : 
     138                 :         /* initialize */
     139            3174 :         convd->from = mbfl_no2encoding(from);
     140            3174 :         convd->to = mbfl_no2encoding(to);
     141            3174 :         if (convd->from == NULL) {
     142               0 :                 convd->from = &mbfl_encoding_pass;
     143                 :         }
     144            3174 :         if (convd->to == NULL) {
     145               0 :                 convd->to = &mbfl_encoding_pass;
     146                 :         }
     147                 : 
     148                 :         /* create convert filter */
     149            3174 :         convd->filter1 = NULL;
     150            3174 :         convd->filter2 = NULL;
     151            3174 :         if (mbfl_convert_filter_get_vtbl(convd->from->no_encoding, convd->to->no_encoding) != NULL) {
     152               0 :                 convd->filter1 = mbfl_convert_filter_new(convd->from->no_encoding, convd->to->no_encoding, mbfl_memory_device_output, 0, &convd->device);
     153                 :         } else {
     154            3174 :                 convd->filter2 = mbfl_convert_filter_new(mbfl_no_encoding_wchar, convd->to->no_encoding, mbfl_memory_device_output, 0, &convd->device);
     155            3174 :                 if (convd->filter2 != NULL) {
     156            3174 :                         convd->filter1 = mbfl_convert_filter_new(convd->from->no_encoding, mbfl_no_encoding_wchar, (int (*)(int, void*))convd->filter2->filter_function, NULL, convd->filter2);
     157            3174 :                         if (convd->filter1 == NULL) {
     158               0 :                                 mbfl_convert_filter_delete(convd->filter2);
     159                 :                         }
     160                 :                 }
     161                 :         }
     162            3174 :         if (convd->filter1 == NULL) {
     163               0 :                 return NULL;
     164                 :         }
     165                 : 
     166            3174 :         mbfl_memory_device_init(&convd->device, buf_initsz, buf_initsz/4);
     167                 : 
     168            3174 :         return convd;
     169                 : }
     170                 : 
     171                 : void
     172                 : mbfl_buffer_converter_delete(mbfl_buffer_converter *convd)
     173            3174 : {
     174            3174 :         if (convd != NULL) {
     175            3174 :                 if (convd->filter1) {
     176            3174 :                         mbfl_convert_filter_delete(convd->filter1);
     177                 :                 }
     178            3174 :                 if (convd->filter2) {
     179            3174 :                         mbfl_convert_filter_delete(convd->filter2);
     180                 :                 }
     181            3174 :                 mbfl_memory_device_clear(&convd->device);
     182            3174 :                 mbfl_free((void*)convd);
     183                 :         }
     184            3174 : }
     185                 : 
     186                 : void
     187                 : mbfl_buffer_converter_reset(mbfl_buffer_converter *convd)
     188               0 : {
     189               0 :         mbfl_memory_device_reset(&convd->device);
     190               0 : }
     191                 : 
     192                 : int
     193                 : mbfl_buffer_converter_illegal_mode(mbfl_buffer_converter *convd, int mode)
     194            3174 : {
     195            3174 :         if (convd != NULL) {
     196            3174 :                 if (convd->filter2 != NULL) {
     197            3174 :                         convd->filter2->illegal_mode = mode;
     198               0 :                 } else if (convd->filter1 != NULL) {
     199               0 :                         convd->filter1->illegal_mode = mode;
     200                 :                 } else {
     201               0 :                         return 0;
     202                 :                 }
     203                 :         }
     204                 : 
     205            3174 :         return 1;
     206                 : }
     207                 : 
     208                 : int
     209                 : mbfl_buffer_converter_illegal_substchar(mbfl_buffer_converter *convd, int substchar)
     210            3174 : {
     211            3174 :         if (convd != NULL) {
     212            3174 :                 if (convd->filter2 != NULL) {
     213            3174 :                         convd->filter2->illegal_substchar = substchar;
     214               0 :                 } else if (convd->filter1 != NULL) {
     215               0 :                         convd->filter1->illegal_substchar = substchar;
     216                 :                 } else {
     217               0 :                         return 0;
     218                 :                 }
     219                 :         }
     220                 : 
     221            3174 :         return 1;
     222                 : }
     223                 : 
     224                 : int
     225                 : mbfl_buffer_converter_strncat(mbfl_buffer_converter *convd, const unsigned char *p, int n)
     226               0 : {
     227                 :         mbfl_convert_filter *filter;
     228                 :         int (*filter_function)(int c, mbfl_convert_filter *filter);
     229                 : 
     230               0 :         if (convd != NULL && p != NULL) {
     231               0 :                 filter = convd->filter1;
     232               0 :                 if (filter != NULL) {
     233               0 :                         filter_function = filter->filter_function;
     234               0 :                         while (n > 0) {
     235               0 :                                 if ((*filter_function)(*p++, filter) < 0) {
     236               0 :                                         break;
     237                 :                                 }
     238               0 :                                 n--;
     239                 :                         }
     240                 :                 }
     241                 :         }
     242                 : 
     243               0 :         return n;
     244                 : }
     245                 : 
     246                 : int
     247                 : mbfl_buffer_converter_feed(mbfl_buffer_converter *convd, mbfl_string *string)
     248            3194 : {
     249                 :         int n;
     250                 :         unsigned char *p;
     251                 :         mbfl_convert_filter *filter;
     252                 :         int (*filter_function)(int c, mbfl_convert_filter *filter);
     253                 : 
     254            3194 :         if (convd == NULL || string == NULL) {
     255               0 :                 return -1;
     256                 :         }
     257            3194 :         mbfl_memory_device_realloc(&convd->device, convd->device.pos + string->len, string->len/4);
     258                 :         /* feed data */
     259            3194 :         n = string->len;
     260            3194 :         p = string->val;
     261            3194 :         filter = convd->filter1;
     262            3194 :         if (filter != NULL) {
     263            3194 :                 filter_function = filter->filter_function;
     264           75171 :                 while (n > 0) {
     265           68783 :                         if ((*filter_function)(*p++, filter) < 0) {
     266               0 :                                 return -1;
     267                 :                         }
     268           68783 :                         n--;
     269                 :                 }
     270                 :         }
     271                 : 
     272            3194 :         return 0;
     273                 : }
     274                 : 
     275                 : int
     276                 : mbfl_buffer_converter_flush(mbfl_buffer_converter *convd)
     277               5 : {
     278               5 :         if (convd == NULL) {
     279               0 :                 return -1;
     280                 :         }
     281                 : 
     282               5 :         if (convd->filter1 != NULL) {
     283               5 :                 mbfl_convert_filter_flush(convd->filter1);
     284                 :         }
     285               5 :         if (convd->filter2 != NULL) {
     286               5 :                 mbfl_convert_filter_flush(convd->filter2);
     287                 :         }
     288                 : 
     289               5 :         return 0;
     290                 : }
     291                 : 
     292                 : mbfl_string *
     293                 : mbfl_buffer_converter_getbuffer(mbfl_buffer_converter *convd, mbfl_string *result)
     294               0 : {
     295               0 :         if (convd != NULL && result != NULL && convd->device.buffer != NULL) {
     296               0 :                 result->no_encoding = convd->to->no_encoding;
     297               0 :                 result->val = convd->device.buffer;
     298               0 :                 result->len = convd->device.pos;
     299                 :         } else {
     300               0 :                 result = NULL;
     301                 :         }
     302                 : 
     303               0 :         return result;
     304                 : }
     305                 : 
     306                 : mbfl_string *
     307                 : mbfl_buffer_converter_result(mbfl_buffer_converter *convd, mbfl_string *result)
     308               5 : {
     309               5 :         if (convd == NULL || result == NULL) {
     310               0 :                 return NULL;
     311                 :         }
     312               5 :         result->no_encoding = convd->to->no_encoding;
     313               5 :         return mbfl_memory_device_result(&convd->device, result);
     314                 : }
     315                 : 
     316                 : mbfl_string *
     317                 : mbfl_buffer_converter_feed_result(mbfl_buffer_converter *convd, mbfl_string *string, 
     318                 :                                   mbfl_string *result)
     319            3189 : {
     320            3189 :         if (convd == NULL || string == NULL || result == NULL) {
     321               0 :                 return NULL;
     322                 :         }
     323            3189 :         mbfl_buffer_converter_feed(convd, string);
     324            3189 :         if (convd->filter1 != NULL) {
     325            3189 :                 mbfl_convert_filter_flush(convd->filter1);
     326                 :         }
     327            3189 :         if (convd->filter2 != NULL) {
     328            3189 :                 mbfl_convert_filter_flush(convd->filter2);
     329                 :         }
     330            3189 :         result->no_encoding = convd->to->no_encoding;
     331            3189 :         return mbfl_memory_device_result(&convd->device, result);
     332                 : }
     333                 : 
     334                 : int mbfl_buffer_illegalchars(mbfl_buffer_converter *convd)
     335            3174 : {
     336            3174 :         int num_illegalchars = 0;
     337                 : 
     338            3174 :         if (convd == NULL) {
     339               0 :                 return 0;
     340                 :         }
     341                 : 
     342            3174 :         if (convd->filter1 != NULL) {
     343            3174 :                 num_illegalchars += convd->filter1->num_illegalchar;
     344                 :         }
     345                 : 
     346            3174 :         if (convd->filter2 != NULL) {
     347            3174 :                 num_illegalchars += convd->filter2->num_illegalchar;
     348                 :         }
     349                 : 
     350            3174 :         return (num_illegalchars);
     351                 : }
     352                 : 
     353                 : /*
     354                 :  * encoding detector
     355                 :  */
     356                 : mbfl_encoding_detector *
     357                 : mbfl_encoding_detector_new(enum mbfl_no_encoding *elist, int elistsz, int strict)
     358               6 : {
     359                 :         mbfl_encoding_detector *identd;
     360                 : 
     361                 :         int i, num;
     362                 :         mbfl_identify_filter *filter;
     363                 : 
     364               6 :         if (elist == NULL || elistsz <= 0) {
     365               0 :                 return NULL;
     366                 :         }
     367                 : 
     368                 :         /* allocate */
     369               6 :         identd = (mbfl_encoding_detector*)mbfl_malloc(sizeof(mbfl_encoding_detector));
     370               6 :         if (identd == NULL) {
     371               0 :                 return NULL;
     372                 :         }
     373               6 :         identd->filter_list = (mbfl_identify_filter **)mbfl_calloc(elistsz, sizeof(mbfl_identify_filter *));
     374               6 :         if (identd->filter_list == NULL) {
     375               0 :                 mbfl_free(identd);
     376               0 :                 return NULL;
     377                 :         }
     378                 : 
     379                 :         /* create filters */
     380               6 :         i = 0;
     381               6 :         num = 0;
     382              42 :         while (i < elistsz) {
     383              30 :                 filter = mbfl_identify_filter_new(elist[i]);
     384              30 :                 if (filter != NULL) {
     385              30 :                         identd->filter_list[num] = filter;
     386              30 :                         num++;
     387                 :                 }
     388              30 :                 i++;
     389                 :         }
     390               6 :         identd->filter_list_size = num;
     391                 : 
     392                 :         /* set strict flag */
     393               6 :         identd->strict = strict;
     394                 : 
     395               6 :         return identd;
     396                 : }
     397                 : 
     398                 : void
     399                 : mbfl_encoding_detector_delete(mbfl_encoding_detector *identd)
     400               6 : {
     401                 :         int i;
     402                 : 
     403               6 :         if (identd != NULL) {
     404               6 :                 if (identd->filter_list != NULL) {
     405               6 :                         i = identd->filter_list_size;
     406              42 :                         while (i > 0) {
     407              30 :                                 i--;
     408              30 :                                 mbfl_identify_filter_delete(identd->filter_list[i]);
     409                 :                         }
     410               6 :                         mbfl_free((void *)identd->filter_list);
     411                 :                 }
     412               6 :                 mbfl_free((void *)identd);
     413                 :         }
     414               6 : }
     415                 : 
     416                 : int
     417                 : mbfl_encoding_detector_feed(mbfl_encoding_detector *identd, mbfl_string *string)
     418               6 : {
     419                 :         int i, n, num, bad, res;
     420                 :         unsigned char *p;
     421                 :         mbfl_identify_filter *filter;
     422                 : 
     423               6 :         res = 0;
     424                 :         /* feed data */
     425               6 :         if (identd != NULL && string != NULL && string->val != NULL) {
     426               6 :                 num = identd->filter_list_size;
     427               6 :                 n = string->len;
     428               6 :                 p = string->val;
     429               6 :                 bad = 0;
     430              18 :                 while (n > 0) {
     431              72 :                         for (i = 0; i < num; i++) {
     432              60 :                                 filter = identd->filter_list[i];
     433              60 :                                 if (!filter->flag) {
     434              48 :                                         (*filter->filter_function)(*p, filter);
     435              48 :                                         if (filter->flag) {
     436              24 :                                                 bad++;
     437                 :                                         }
     438                 :                                 }
     439                 :                         }
     440              12 :                         if ((num - 1) <= bad) {
     441               6 :                                 res = 1;
     442               6 :                                 break;
     443                 :                         }
     444               6 :                         p++;
     445               6 :                         n--;
     446                 :                 }
     447                 :         }
     448                 : 
     449               6 :         return res;
     450                 : }
     451                 : 
     452                 : enum mbfl_no_encoding mbfl_encoding_detector_judge(mbfl_encoding_detector *identd)
     453               6 : {
     454                 :         mbfl_identify_filter *filter;
     455                 :         enum mbfl_no_encoding encoding;
     456                 :         int n;
     457                 : 
     458                 :         /* judge */
     459               6 :         encoding = mbfl_no_encoding_invalid;
     460               6 :         if (identd != NULL) {
     461               6 :                 n = identd->filter_list_size - 1;
     462              42 :                 while (n >= 0) {
     463              30 :                         filter = identd->filter_list[n];
     464              30 :                         if (!filter->flag) {
     465               6 :                                 if (!identd->strict || !filter->status) {
     466               6 :                                         encoding = filter->encoding->no_encoding;
     467                 :                                 }
     468                 :                         }
     469              30 :                         n--;
     470                 :                 }
     471                 : 
     472               6 :                 if (encoding == mbfl_no_encoding_invalid) {
     473               0 :                         n = identd->filter_list_size - 1;
     474               0 :                         while (n >= 0) {
     475               0 :                                 filter = identd->filter_list[n];
     476               0 :                                 if (!filter->flag) {
     477               0 :                                         encoding = filter->encoding->no_encoding;
     478                 :                                 }
     479               0 :                                 n--;
     480                 :                         }
     481                 :                 }
     482                 :         }
     483                 : 
     484               6 :         return encoding;
     485                 : }
     486                 : 
     487                 : 
     488                 : /*
     489                 :  * encoding converter
     490                 :  */
     491                 : mbfl_string *
     492                 : mbfl_convert_encoding(
     493                 :     mbfl_string *string,
     494                 :     mbfl_string *result,
     495                 :     enum mbfl_no_encoding toenc)
     496             160 : {
     497                 :         int n;
     498                 :         unsigned char *p;
     499                 :         const mbfl_encoding *encoding;
     500                 :         mbfl_memory_device device;
     501                 :         mbfl_convert_filter *filter1;
     502                 :         mbfl_convert_filter *filter2;
     503                 : 
     504                 :         /* initialize */
     505             160 :         encoding = mbfl_no2encoding(toenc);
     506             160 :         if (encoding == NULL || string == NULL || result == NULL) {
     507               0 :                 return NULL;
     508                 :         }
     509                 : 
     510             160 :         filter1 = NULL;
     511             160 :         filter2 = NULL;
     512             160 :         if (mbfl_convert_filter_get_vtbl(string->no_encoding, toenc) != NULL) {
     513              11 :                 filter1 = mbfl_convert_filter_new(string->no_encoding, toenc, mbfl_memory_device_output, 0, &device);
     514                 :         } else {
     515             149 :                 filter2 = mbfl_convert_filter_new(mbfl_no_encoding_wchar, toenc, mbfl_memory_device_output, 0, &device);
     516             149 :                 if (filter2 != NULL) {
     517             149 :                         filter1 = mbfl_convert_filter_new(string->no_encoding, mbfl_no_encoding_wchar, (int (*)(int, void*))filter2->filter_function, NULL, filter2);
     518             149 :                         if (filter1 == NULL) {
     519               0 :                                 mbfl_convert_filter_delete(filter2);
     520                 :                         }
     521                 :                 }
     522                 :         }
     523             160 :         if (filter1 == NULL) {
     524               0 :                 return NULL;
     525                 :         }
     526                 : 
     527             160 :         if (filter2 != NULL) {
     528             149 :                 filter2->illegal_mode = MBFL_OUTPUTFILTER_ILLEGAL_MODE_CHAR;
     529             149 :                 filter2->illegal_substchar = 0x3f;           /* '?' */
     530                 :         }
     531                 : 
     532             160 :         mbfl_memory_device_init(&device, string->len, (string->len >> 2) + 8);
     533                 : 
     534                 :         /* feed data */
     535             160 :         n = string->len;
     536             160 :         p = string->val;
     537             160 :         if (p != NULL) {
     538            4780 :                 while (n > 0) {
     539            4460 :                         if ((*filter1->filter_function)(*p++, filter1) < 0) {
     540               0 :                                 break;
     541                 :                         }
     542            4460 :                         n--;
     543                 :                 }
     544                 :         }
     545                 : 
     546             160 :         mbfl_convert_filter_flush(filter1);
     547             160 :         mbfl_convert_filter_delete(filter1);
     548             160 :         if (filter2 != NULL) {
     549             149 :                 mbfl_convert_filter_flush(filter2);
     550             149 :                 mbfl_convert_filter_delete(filter2);
     551                 :         }
     552                 : 
     553             160 :         return mbfl_memory_device_result(&device, result);
     554                 : }
     555                 : 
     556                 : 
     557                 : /*
     558                 :  * identify encoding
     559                 :  */
     560                 : const mbfl_encoding *
     561                 : mbfl_identify_encoding(mbfl_string *string, enum mbfl_no_encoding *elist, int elistsz, int strict)
     562              28 : {
     563                 :         int i, n, num, bad;
     564                 :         unsigned char *p;
     565                 :         mbfl_identify_filter *flist, *filter;
     566                 :         const mbfl_encoding *encoding;
     567                 : 
     568                 :         /* flist is an array of mbfl_identify_filter instances */
     569              28 :         flist = (mbfl_identify_filter *)mbfl_calloc(elistsz, sizeof(mbfl_identify_filter));
     570              28 :         if (flist == NULL) {
     571               0 :                 return NULL;
     572                 :         }
     573                 : 
     574              28 :         num = 0;
     575              28 :         if (elist != NULL) {
     576             116 :                 for (i = 0; i < elistsz; i++) {
     577              88 :                         if (!mbfl_identify_filter_init(&flist[num], elist[i])) {
     578              88 :                                 num++;
     579                 :                         }
     580                 :                 }
     581                 :         }
     582                 : 
     583                 :         /* feed data */
     584              28 :         n = string->len;
     585              28 :         p = string->val;
     586                 : 
     587              28 :         if (p != NULL) {
     588              28 :                 bad = 0;
     589             331 :                 while (n > 0) {
     590            1509 :                         for (i = 0; i < num; i++) {
     591            1216 :                                 filter = &flist[i];
     592            1216 :                                 if (!filter->flag) {
     593            1106 :                                         (*filter->filter_function)(*p, filter);
     594            1106 :                                         if (filter->flag) {
     595              44 :                                                 bad++;
     596                 :                                         }
     597                 :                                 }
     598                 :                         }
     599             293 :                         if ((num - 1) <= bad && !strict) {
     600              18 :                                 break;
     601                 :                         }
     602             275 :                         p++;
     603             275 :                         n--;
     604                 :                 }
     605                 :         }
     606                 : 
     607                 :         /* judge */
     608              28 :         encoding = NULL;
     609                 : 
     610              61 :         for (i = 0; i < num; i++) {
     611              59 :                 filter = &flist[i];
     612              59 :                 if (!filter->flag) {
     613              27 :                         if (strict && filter->status) {
     614               1 :                                 continue;
     615                 :                         }
     616              26 :                         encoding = filter->encoding;
     617              26 :                         break;
     618                 :                 }
     619                 :         }
     620                 : 
     621                 :         /* fall-back judge */
     622              28 :         if (!encoding) {
     623               4 :                 for (i = 0; i < num; i++) {
     624               2 :                         filter = &flist[i];
     625               2 :                         if (!filter->flag && (!strict || !filter->status)) {
     626               0 :                                 encoding = filter->encoding;
     627               0 :                                 break;
     628                 :                         }
     629                 :                 }
     630                 :         }
     631                 : 
     632                 :         /* cleanup */
     633                 :         /* dtors should be called in reverse order */
     634             144 :         i = num; while (--i >= 0) {
     635              88 :                 mbfl_identify_filter_cleanup(&flist[i]);
     636                 :         }
     637                 : 
     638              28 :         mbfl_free((void *)flist);
     639                 : 
     640              28 :         return encoding;
     641                 : }
     642                 : 
     643                 : const char*
     644                 : mbfl_identify_encoding_name(mbfl_string *string, enum mbfl_no_encoding *elist, int elistsz, int strict)
     645              17 : {
     646                 :         const mbfl_encoding *encoding;
     647                 : 
     648              17 :         encoding = mbfl_identify_encoding(string, elist, elistsz, strict);
     649              17 :         if (encoding != NULL &&
     650                 :             encoding->no_encoding > mbfl_no_encoding_charset_min &&
     651                 :             encoding->no_encoding < mbfl_no_encoding_charset_max) {
     652              15 :                 return encoding->name;
     653                 :         } else {
     654               2 :                 return NULL;
     655                 :         }
     656                 : }
     657                 : 
     658                 : enum mbfl_no_encoding
     659                 : mbfl_identify_encoding_no(mbfl_string *string, enum mbfl_no_encoding *elist, int elistsz, int strict)
     660              11 : {
     661                 :         const mbfl_encoding *encoding;
     662                 : 
     663              11 :         encoding = mbfl_identify_encoding(string, elist, elistsz, strict);
     664              11 :         if (encoding != NULL &&
     665                 :             encoding->no_encoding > mbfl_no_encoding_charset_min &&
     666                 :             encoding->no_encoding < mbfl_no_encoding_charset_max) {
     667              11 :                 return encoding->no_encoding;
     668                 :         } else {
     669               0 :                 return mbfl_no_encoding_invalid;
     670                 :         }
     671                 : }
     672                 : 
     673                 : 
     674                 : /*
     675                 :  *  strlen
     676                 :  */
     677                 : static int
     678                 : filter_count_output(int c, void *data)
     679             611 : {
     680             611 :         (*(int *)data)++;
     681             611 :         return c;
     682                 : }
     683                 : 
     684                 : int
     685                 : mbfl_strlen(mbfl_string *string)
     686            1120 : {
     687                 :         int len, n, m, k;
     688                 :         unsigned char *p;
     689                 :         const unsigned char *mbtab;
     690                 :         const mbfl_encoding *encoding;
     691                 : 
     692            1120 :         encoding = (string == NULL) ? NULL : mbfl_no2encoding(string->no_encoding);
     693            1120 :         if (encoding == NULL) {
     694               0 :                 return -1;
     695                 :         }
     696                 : 
     697            1120 :         len = 0;
     698            1120 :         if (encoding->flag & MBFL_ENCTYPE_SBCS) {
     699              69 :                 len = string->len;
     700            1051 :         } else if (encoding->flag & (MBFL_ENCTYPE_WCS2BE | MBFL_ENCTYPE_WCS2LE)) {
     701               6 :                 len = string->len/2;
     702            1045 :         } else if (encoding->flag & (MBFL_ENCTYPE_WCS4BE | MBFL_ENCTYPE_WCS4LE)) {
     703              12 :                 len = string->len/4;
     704            1033 :         } else if (encoding->mblen_table != NULL) {
     705            1012 :                 mbtab = encoding->mblen_table;
     706            1012 :                 n = 0;
     707            1012 :                 p = string->val;
     708            1012 :                 k = string->len;
     709                 :                 /* count */
     710            1012 :                 if (p != NULL) {
     711           17252 :                         while (n < k) {
     712           15228 :                                 m = mbtab[*p];
     713           15228 :                                 n += m;
     714           15228 :                                 p += m;
     715           15228 :                                 len++;
     716                 :                         }
     717                 :                 }
     718                 :         } else {
     719                 :                 /* wchar filter */
     720                 :                 mbfl_convert_filter *filter = mbfl_convert_filter_new(
     721                 :                   string->no_encoding, 
     722                 :                   mbfl_no_encoding_wchar,
     723              21 :                   filter_count_output, 0, &len);
     724              21 :                 if (filter == NULL) {
     725               0 :                         return -1;
     726                 :                 }
     727                 :                 /* count */
     728              21 :                 n = string->len;
     729              21 :                 p = string->val;
     730              21 :                 if (p != NULL) {
     731             802 :                         while (n > 0) {
     732             760 :                                 (*filter->filter_function)(*p++, filter);
     733             760 :                                 n--;
     734                 :                         }
     735                 :                 }
     736              21 :                 mbfl_convert_filter_delete(filter);
     737                 :         }
     738                 : 
     739            1120 :         return len;
     740                 : }
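
The mblen_table branch of mbfl_strlen() above counts characters by reading the sequence length for each lead byte and hopping that many bytes forward. Below is a minimal sketch of the same idea for UTF-8 only. It is not part of mbfilter.c: the names utf8_seq_len and count_chars_by_table are hypothetical, and the length is computed from the lead byte rather than read from a 256-entry table as the library does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: byte length of a UTF-8 sequence, derived from its
 * lead byte. Continuation bytes and invalid lead bytes are treated as
 * length 1 so that the scan always advances. */
static size_t utf8_seq_len(unsigned char lead)
{
    if (lead < 0x80) return 1;              /* 0xxxxxxx: ASCII */
    if ((lead & 0xe0) == 0xc0) return 2;    /* 110xxxxx */
    if ((lead & 0xf0) == 0xe0) return 3;    /* 1110xxxx */
    if ((lead & 0xf8) == 0xf0) return 4;    /* 11110xxx */
    return 1;
}

/* Count characters by hopping from lead byte to lead byte.
 * A sequence truncated at the end of the buffer still counts as one
 * character, because the loop stops once n reaches or passes len. */
size_t count_chars_by_table(const unsigned char *val, size_t len)
{
    size_t n = 0, count = 0;

    assert(val != NULL || len == 0);
    while (n < len) {
        n += utf8_seq_len(val[n]);
        count++;
    }
    return count;
}
```

For the 6-byte input "h\xc3\xa9llo" this returns 5, since the two-byte sequence for "é" is consumed in a single step.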
     741                 : 
     742                 :  
     743                 : /*
     744                 :  *  strpos
     745                 :  */
     746                 : struct collector_strpos_data {
     747                 :         mbfl_convert_filter *next_filter;
     748                 :         mbfl_wchar_device needle;
     749                 :         int needle_len;
     750                 :         int start;
     751                 :         int output;
     752                 :         int found_pos;
     753                 :         int needle_pos;
     754                 :         int matched_pos;
     755                 : };
     756                 : 
     757                 : static int
     758                 : collector_strpos(int c, void* data)
     759            4417 : {
     760                 :         int *p, *h, *m, n;
     761            4417 :         struct collector_strpos_data *pc = (struct collector_strpos_data*)data;
     762                 : 
     763            4417 :         if (pc->output >= pc->start) {
     764            4417 :                 if (c == (int)pc->needle.buffer[pc->needle_pos]) {
     765            2480 :                         if (pc->needle_pos == 0) {
     766            1175 :                                 pc->found_pos = pc->output;                       /* found position */
     767                 :                         }
     768            2480 :                         pc->needle_pos++;                                            /* needle pointer */
     769            2480 :                         if (pc->needle_pos >= pc->needle_len) {
     770             647 :                                 pc->matched_pos = pc->found_pos;  /* matched position */
     771             647 :                                 pc->needle_pos--;
     772             647 :                                 goto retry;
     773                 :                         }
     774            1937 :                 } else if (pc->needle_pos != 0) {
     775            1168 : retry:
     776            1168 :                         h = (int *)pc->needle.buffer;
     777            1168 :                         h++;
     778                 :                         for (;;) {
     779            1822 :                                 pc->found_pos++;
     780            1822 :                                 p = h;
     781            1822 :                                 m = (int *)pc->needle.buffer;
     782            1822 :                                 n = pc->needle_pos - 1;
     783            3650 :                                 while (n > 0 && *p == *m) {
     784               6 :                                         n--;
     785               6 :                                         p++;
     786               6 :                                         m++;
     787                 :                                 }
     788            1822 :                                 if (n <= 0) {
     789            1168 :                                         if (*m != c) {
     790            1155 :                                                 pc->needle_pos = 0;
     791                 :                                         }
     792            1168 :                                         break;
     793                 :                                 } else {
     794             654 :                                         h++;
     795             654 :                                         pc->needle_pos--;
     796                 :                                 }
     797             654 :                         }
     798                 :                 }
     799                 :         }
     800                 : 
     801            4417 :         pc->output++;
     802            4417 :         return c;
     803                 : }
     804                 : 
     805                 : /*
     806                 :  *      oddlen
     807                 :  */
     808                 : int 
     809                 : mbfl_oddlen(mbfl_string *string)
     810               0 : {
     811                 :         int len, n, m, k;
     812                 :         unsigned char *p;
     813                 :         const unsigned char *mbtab;
     814                 :         const mbfl_encoding *encoding;
     815                 : 
     816                 : 
     817               0 :         if (string == NULL) {
     818               0 :                 return -1;
     819                 :         }
     820               0 :         encoding = mbfl_no2encoding(string->no_encoding);
     821               0 :         if (encoding == NULL) {
     822               0 :                 return -1;
     823                 :         }
     824                 : 
     825               0 :         len = 0;
     826               0 :         if (encoding->flag & MBFL_ENCTYPE_SBCS) {
     827               0 :                 return 0;
     828               0 :         } else if (encoding->flag & (MBFL_ENCTYPE_WCS2BE | MBFL_ENCTYPE_WCS2LE)) {
     829               0 :                 return string->len % 2;
     830               0 :         } else if (encoding->flag & (MBFL_ENCTYPE_WCS4BE | MBFL_ENCTYPE_WCS4LE)) {
     831               0 :                 return string->len % 4;
     832               0 :         } else if (encoding->mblen_table != NULL) {
     833               0 :                 mbtab = encoding->mblen_table;
     834               0 :                 n = 0;
     835               0 :                 p = string->val;
     836               0 :                 k = string->len;
     837                 :                 /* count */
     838               0 :                 if (p != NULL) {
     839               0 :                         while (n < k) {
     840               0 :                                 m = mbtab[*p];
     841               0 :                                 n += m;
     842               0 :                                 p += m;
     843                 :                         }
     844                 :                 }
     845               0 :                 return n-k;
     846                 :         } else {
     847                 :                 /* no length table for this encoding, so the odd length cannot be computed */
     848               0 :                 return 0;
     849                 :         }
     850                 :         /* NOT REACHED */
     851                 : }
     852                 : 
     853                 : int
     854                 : mbfl_strpos(
     855                 :     mbfl_string *haystack,
     856                 :     mbfl_string *needle,
     857                 :     int offset,
     858                 :     int reverse)
     859             785 : {
     860                 :         int result;
     861                 :         mbfl_string _haystack_u8, _needle_u8;
     862                 :         const mbfl_string *haystack_u8, *needle_u8;
     863                 :         const unsigned char *u8_tbl;
     864                 : 
     865             785 :         if (haystack == NULL || haystack->val == NULL || needle == NULL || needle->val == NULL) {
     866               0 :                 return -8;
     867                 :         }
     868                 : 
     869                 :         {
     870                 :                 const mbfl_encoding *u8_enc;
     871             785 :                 u8_enc = mbfl_no2encoding(mbfl_no_encoding_utf8);
     872             785 :                 if (u8_enc == NULL || u8_enc->mblen_table == NULL) {
     873               0 :                         return -8;
     874                 :                 }
     875             785 :                 u8_tbl = u8_enc->mblen_table;
     876                 :         }
     877                 : 
     878             785 :         if (haystack->no_encoding != mbfl_no_encoding_utf8) {
     879              66 :                 mbfl_string_init(&_haystack_u8);
     880              66 :                 haystack_u8 = mbfl_convert_encoding(haystack, &_haystack_u8, mbfl_no_encoding_utf8);
     881              66 :                 if (haystack_u8 == NULL) {
     882               0 :                         result = -4;
     883               0 :                         goto out;
     884                 :                 }
     885                 :         } else {
     886             719 :                 haystack_u8 = haystack;
     887                 :         }
     888                 : 
     889             785 :         if (needle->no_encoding != mbfl_no_encoding_utf8) {
     890              66 :                 mbfl_string_init(&_needle_u8);
     891              66 :                 needle_u8 = mbfl_convert_encoding(needle, &_needle_u8, mbfl_no_encoding_utf8);
     892              66 :                 if (needle_u8 == NULL) {
     893               0 :                         result = -4;
     894               0 :                         goto out;
     895                 :                 }
     896                 :         } else {
     897             719 :                 needle_u8 = needle;
     898                 :         }
     899                 : 
     900             785 :         if (needle_u8->len < 1) {
     901               0 :                 result = -8;
     902               0 :                 goto out;
     903                 :         }
     904                 : 
     905             785 :         result = -1;
     906             785 :         if (haystack_u8->len < needle_u8->len) {
     907             116 :                 goto out;
     908                 :         }
     909                 : 
     910             669 :         if (!reverse) {
     911                 :                 unsigned int jtbl[1 << (sizeof(unsigned char) * 8)];
     912             354 :                 unsigned int needle_u8_len = needle_u8->len;
     913                 :                 unsigned int i;
     914                 :                 const unsigned char *p, *q, *e;
     915             354 :                 const unsigned char *haystack_u8_val = haystack_u8->val,
     916             354 :                                     *needle_u8_val = needle_u8->val;
     917           90978 :                 for (i = 0; i < sizeof(jtbl) / sizeof(*jtbl); ++i) {
     918           90624 :                         jtbl[i] = needle_u8_len + 1;
     919                 :                 }
     920            1389 :                 for (i = 0; i < needle_u8_len - 1; ++i) {
     921            1035 :                         jtbl[needle_u8_val[i]] = needle_u8_len - i;
     922                 :                 }
     923             354 :                 e = haystack_u8_val + haystack_u8->len;
     924             354 :                 p = haystack_u8_val;
     925            1556 :                 while (--offset >= 0) {
     926             848 :                         if (p >= e) {
     927               0 :                                 result = -16;
     928               0 :                                 goto out;
     929                 :                         }
     930             848 :                         p += u8_tbl[*p];
     931                 :                 }
     932             354 :                 p += needle_u8_len;
     933             354 :                 if (p > e) {
     934               8 :                         goto out;
     935                 :                 }
     936            2997 :                 while (p <= e) {
     937            2547 :                         const unsigned char *pv = p;
     938            2547 :                         q = needle_u8_val + needle_u8_len;
     939                 :                         for (;;) {
     940            3459 :                                 if (q == needle_u8_val) {
     941             242 :                                         result = 0;
     942            3332 :                                         while (p > haystack_u8_val) {
     943            2848 :                                                 unsigned char c = *--p;
     944            2848 :                                                 if (c < 0x80) {
     945            1101 :                                                         ++result;
     946            1747 :                                                 } else if ((c & 0xc0) != 0x80) {
     947             693 :                                                         ++result;
     948                 :                                                 }       
     949                 :                                         }
     950             242 :                                         goto out;
     951                 :                                 }
     952            3217 :                                 if (*--q != *--p) {
     953            2305 :                                         break;
     954                 :                                 }
     955             912 :                         }
     956            2305 :                         p += jtbl[*p];
     957            2305 :                         if (p <= pv) {
     958               0 :                                 p = pv + 1;
     959                 :                         }
     960                 :                 }
     961                 :         } else {
     962                 :                 unsigned int jtbl[1 << (sizeof(unsigned char) * 8)];
     963             315 :                 unsigned int needle_u8_len = needle_u8->len, needle_len = 0;
     964                 :                 unsigned int i;
     965                 :                 const unsigned char *p, *e, *q, *qe;
     966             315 :                 const unsigned char *haystack_u8_val = haystack_u8->val,
     967             315 :                                     *needle_u8_val = needle_u8->val;
     968           80955 :                 for (i = 0; i < sizeof(jtbl) / sizeof(*jtbl); ++i) {
     969           80640 :                         jtbl[i] = needle_u8_len;
     970                 :                 }
     971            1303 :                 for (i = needle_u8_len - 1; i > 0; --i) {
     972             988 :                         unsigned char c = needle_u8_val[i];
     973             988 :                         jtbl[c] = i;
     974             988 :                         if (c < 0x80) {
     975             347 :                                 ++needle_len;
     976             641 :                         } else if ((c & 0xc0) != 0x80) {
     977             226 :                                 ++needle_len;
     978                 :                         }
     979                 :                 }
     980                 :                 {
     981             315 :                         unsigned char c = needle_u8_val[0];
     982             315 :                         if (c < 0x80) {
     983             218 :                                 ++needle_len;
     984              97 :                         } else if ((c & 0xc0) != 0x80) {
     985              97 :                                 ++needle_len;
     986                 :                         }
     987                 :                 }
     988             315 :                 e = haystack_u8_val;
     989             315 :                 p = e + haystack_u8->len;
     990             315 :                 qe = needle_u8_val + needle_u8_len;
     991             315 :                 if (offset < 0) {
     992              13 :                         if (-offset > needle_len) {
     993               9 :                                 offset += needle_len; 
     994             147 :                                 while (offset < 0) {
     995                 :                                         unsigned char c;
     996             129 :                                         if (p <= e) {
     997               0 :                                                 result = -16;
     998               0 :                                                 goto out;
     999                 :                                         }
    1000             129 :                                         c = *(--p);
    1001             129 :                                         if (c < 0x80) {
    1002              54 :                                                 ++offset;
    1003              75 :                                         } else if ((c & 0xc0) != 0x80) {
    1004              25 :                                                 ++offset;
    1005                 :                                         }
    1006                 :                                 }
    1007                 :                         }
    1008                 :                 } else {
    1009             302 :                         const unsigned char *ee = haystack_u8_val + haystack_u8->len;
    1010            1055 :                         while (--offset >= 0) {
    1011             451 :                                 if (e >= ee) {
    1012               0 :                                         result = -16;
    1013               0 :                                         goto out;
    1014                 :                                 }
    1015             451 :                                 e += u8_tbl[*e];
    1016                 :                         }
    1017                 :                 }
    1018             315 :                 if (p < e + needle_u8_len) {
    1019               5 :                         goto out;
    1020                 :                 }
    1021             310 :                 p -= needle_u8_len;
    1022            1435 :                 while (p >= e) {
    1023            1012 :                         const unsigned char *pv = p;
    1024            1012 :                         q = needle_u8_val;
    1025                 :                         for (;;) {
    1026            1793 :                                 if (q == qe) {
    1027             197 :                                         result = 0;
    1028             197 :                                         p -= needle_u8_len;
    1029            3041 :                                         while (p > haystack_u8_val) {
    1030            2647 :                                                 unsigned char c = *--p;
    1031            2647 :                                                 if (c < 0x80) {
    1032            1167 :                                                         ++result;
    1033            1480 :                                                 } else if ((c & 0xc0) != 0x80) {
    1034             612 :                                                         ++result;
    1035                 :                                                 }       
    1036                 :                                         }
    1037             197 :                                         goto out;
    1038                 :                                 }
    1039            1596 :                                 if (*q != *p) {
    1040             815 :                                         break;
    1041                 :                                 }
    1042             781 :                                 ++p, ++q;
    1043             781 :                         }
    1044             815 :                         p -= jtbl[*p];
    1045             815 :                         if (p >= pv) {
    1046               0 :                                 p = pv - 1;
    1047                 :                         }
    1048                 :                 }
    1049                 :         }
    1050             785 : out:
    1051             785 :         if (haystack_u8 == &_haystack_u8) {
    1052              66 :                 mbfl_string_clear(&_haystack_u8);
    1053                 :         }
    1054             785 :         if (needle_u8 == &_needle_u8) {
    1055              66 :                 mbfl_string_clear(&_needle_u8);
    1056                 :         }
    1057             785 :         return result;
    1058                 : }
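
mbfl_strpos() above works in two stages: it searches the UTF-8 bytes using a per-byte jump table, then converts the matching byte offset into a character position by counting every byte that is not a continuation byte (10xxxxxx). The sketch below illustrates both stages under stated assumptions. It is not the library's code: the skip table is the textbook Horspool variant rather than the exact jtbl construction used above, only forward search from offset 0 is covered, and the names utf8_char_index, horspool_find and utf8_strpos_sketch are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: number of UTF-8 characters that start before
 * byte_off. Bytes of the form 10xxxxxx are continuation bytes and are
 * not counted. */
static long utf8_char_index(const unsigned char *val, size_t byte_off)
{
    long index = 0;

    while (byte_off > 0) {
        unsigned char c = val[--byte_off];
        if ((c & 0xc0) != 0x80) {
            ++index;
        }
    }
    return index;
}

/* Textbook Horspool search over raw bytes (an assumption, not the jtbl
 * scheme of mbfl_strpos). Returns a byte offset, or -1 if not found. */
static long horspool_find(const unsigned char *hay, size_t hay_len,
                          const unsigned char *needle, size_t needle_len)
{
    size_t skip[256];
    size_t i, pos = 0;

    if (needle_len == 0 || hay_len < needle_len) {
        return -1;
    }
    /* default shift: the whole needle length */
    for (i = 0; i < 256; ++i) {
        skip[i] = needle_len;
    }
    /* bytes inside the needle (except the last) allow a shorter shift */
    for (i = 0; i + 1 < needle_len; ++i) {
        skip[needle[i]] = needle_len - 1 - i;
    }
    while (pos + needle_len <= hay_len) {
        if (memcmp(hay + pos, needle, needle_len) == 0) {
            return (long)pos;
        }
        /* shift by the entry for the last byte of the current window */
        pos += skip[hay[pos + needle_len - 1]];
    }
    return -1;
}

/* Character position of the first match, or -1 if the needle is absent
 * or empty. Both arguments are assumed to be valid UTF-8 C strings. */
long utf8_strpos_sketch(const char *haystack, const char *needle)
{
    const unsigned char *h = (const unsigned char *)haystack;
    const unsigned char *n = (const unsigned char *)needle;
    long off;

    assert(haystack != NULL && needle != NULL);
    off = horspool_find(h, strlen(haystack), n, strlen(needle));
    return off < 0 ? -1 : utf8_char_index(h, (size_t)off);
}
```

With these definitions, utf8_strpos_sketch("h\xc3\xa9llo", "llo") returns 2: the match starts at byte offset 3, and only two of the three preceding bytes begin a character. Searching at the byte level is safe for valid UTF-8 because a valid needle can only match at a character boundary.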
    1059                 : 
    1060                 : /*
    1061                 :  *  substr_count
    1062                 :  */
    1063                 : 
    1064                 : int
    1065                 : mbfl_substr_count(
    1066                 :     mbfl_string *haystack,
    1067                 :     mbfl_string *needle
    1068                 :    )
    1069              65 : {
    1070              65 :         int n, result = 0;
    1071                 :         unsigned char *p;
    1072                 :         mbfl_convert_filter *filter;
    1073                 :         struct collector_strpos_data pc;
    1074                 : 
    1075              65 :         if (haystack == NULL || needle == NULL) {
    1076               0 :                 return -8;
    1077                 :         }
    1078                 :         /* needle is converted into wchar */
    1079              65 :         mbfl_wchar_device_init(&pc.needle);
    1080              65 :         filter = mbfl_convert_filter_new(
    1081                 :           needle->no_encoding,
    1082                 :           mbfl_no_encoding_wchar,
    1083                 :           mbfl_wchar_device_output, 0, &pc.needle);
    1084              65 :         if (filter == NULL) {
    1085               0 :                 return -4;
    1086                 :         }
    1087              65 :         p = needle->val;
    1088              65 :         n = needle->len;
    1089              65 :         if (p != NULL) {
    1090             456 :                 while (n > 0) {
    1091             326 :                         if ((*filter->filter_function)(*p++, filter) < 0) {
    1092               0 :                                 break;
    1093                 :                         }
    1094             326 :                         n--;
    1095                 :                 }
    1096                 :         }
    1097              65 :         mbfl_convert_filter_flush(filter);
    1098              65 :         mbfl_convert_filter_delete(filter);
    1099              65 :         pc.needle_len = pc.needle.pos;
    1100              65 :         if (pc.needle.buffer == NULL) {
    1101               0 :                 return -4;
    1102                 :         }
    1103              65 :         if (pc.needle_len <= 0) {
    1104               0 :                 mbfl_wchar_device_clear(&pc.needle);
    1105               0 :                 return -2;
    1106                 :         }
    1107                 :         /* initialize filter and collector data */
    1108              65 :         filter = mbfl_convert_filter_new(
    1109                 :           haystack->no_encoding,
    1110                 :           mbfl_no_encoding_wchar,
    1111                 :           collector_strpos, 0, &pc);
    1112              65 :         if (filter == NULL) {
    1113               0 :                 mbfl_wchar_device_clear(&pc.needle);
    1114               0 :                 return -4;
    1115                 :         }
    1116              65 :         pc.start = 0;
    1117              65 :         pc.output = 0;
    1118              65 :         pc.needle_pos = 0;
    1119              65 :         pc.found_pos = 0;
    1120              65 :         pc.matched_pos = -1;
    1121                 : 
    1122                 :         /* feed data */
    1123              65 :         p = haystack->val;
    1124              65 :         n = haystack->len;
    1125              65 :         if (p != NULL) {
    1126            7365 :                 while (n > 0) {
    1127            7235 :                         if ((*filter->filter_function)(*p++, filter) < 0) {
    1128               0 :                                 pc.matched_pos = -4;
    1129               0 :                                 break;
    1130                 :                         }
    1131            7235 :                         if (pc.matched_pos >= 0) {
    1132             647 :                                 ++result;
    1133             647 :                                 pc.matched_pos = -1;
    1134             647 :                                 pc.needle_pos = 0;
    1135                 :                         }
    1136            7235 :                         n--;
    1137                 :                 }
    1138                 :         }
    1139              65 :         mbfl_convert_filter_flush(filter);
    1140              65 :         mbfl_convert_filter_delete(filter);
    1141              65 :         mbfl_wchar_device_clear(&pc.needle);
    1142                 : 
    1143              65 :         return result;
    1144                 : }
    1145                 : 
    1146                 : /*
    1147                 :  *  substr
    1148                 :  */
    1149                 : struct collector_substr_data {
    1150                 :         mbfl_convert_filter *next_filter;
    1151                 :         int start;
    1152                 :         int stop;
    1153                 :         int output;
    1154                 : };
    1155                 : 
    1156                 : static int
    1157                 : collector_substr(int c, void* data)
    1158             128 : {
    1159             128 :         struct collector_substr_data *pc = (struct collector_substr_data*)data;
    1160                 : 
    1161             128 :         if (pc->output >= pc->stop) {
    1162              17 :                 return -1;
    1163                 :         }
    1164                 : 
    1165             111 :         if (pc->output >= pc->start) {
    1166              91 :                 (*pc->next_filter->filter_function)(c, pc->next_filter);
    1167                 :         }
    1168                 : 
    1169             111 :         pc->output++;
    1170                 : 
    1171             111 :         return c;
    1172                 : }
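
The collector above forwards only the characters whose index falls in the window [start, stop) and returns a negative value once the window is exhausted, which lets the feeding loop stop early. The following is a minimal, self-contained sketch of that pattern, not libmbfl code: `demo_sink`, `demo_window`, `demo_window_collect` and `demo_feed` are hypothetical names, and a plain character buffer stands in for the next filter in the chain.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical output sink standing in for the next filter. */
struct demo_sink {
    char buf[16];
    int len;
};

/* Hypothetical counterpart of struct collector_substr_data. */
struct demo_window {
    int start;              /* first character index to forward */
    int stop;               /* first character index NOT to forward */
    int output;             /* number of characters seen so far */
    struct demo_sink *sink; /* where forwarded characters go */
};

static int demo_sink_put(int c, struct demo_sink *s)
{
    /* Keep one byte free for the terminating NUL. */
    if (s->len < (int)sizeof(s->buf) - 1) {
        s->buf[s->len++] = (char)c;
        s->buf[s->len] = '\0';
    }
    return c;
}

/* Same control flow as collector_substr: stop, forward, count. */
static int demo_window_collect(int c, void *data)
{
    struct demo_window *w = data;

    assert(w != NULL && w->sink != NULL);
    if (w->output >= w->stop) {
        return -1;          /* past the window: tell the caller to stop */
    }
    if (w->output >= w->start) {
        demo_sink_put(c, w->sink);
    }
    w->output++;
    return c;
}

/* Feed a NUL-terminated string; returns how many characters were consumed. */
static int demo_feed(const char *s, struct demo_window *w)
{
    int n = 0;

    while (s[n] != '\0') {
        if (demo_window_collect((unsigned char)s[n], w) < 0) {
            break;
        }
        n++;
    }
    return n;
}
```

Feeding "abcdef" with start 2 and stop 4 leaves "cd" in the sink, and the feed loop stops as soon as the fifth character is offered.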
    1173                 : 
    1174                 : mbfl_string *
    1175                 : mbfl_substr(
    1176                 :     mbfl_string *string,
    1177                 :     mbfl_string *result,
    1178                 :     int from,
    1179                 :     int length)
    1180             404 : {
    1181                 :         const mbfl_encoding *encoding;
    1182                 :         int n, m, k, len, start, end;
    1183                 :         unsigned char *p, *w;
    1184                 :         const unsigned char *mbtab;
    1185                 : 
    1186             404 :         encoding = mbfl_no2encoding(string->no_encoding);
    1187             404 :         if (encoding == NULL || string == NULL || result == NULL) {
    1188               0 :                 return NULL;
    1189                 :         }
    1190             404 :         mbfl_string_init(result);
    1191             404 :         result->no_language = string->no_language;
    1192             404 :         result->no_encoding = string->no_encoding;
    1193                 : 
    1194             788 :         if ((encoding->flag & (MBFL_ENCTYPE_SBCS | MBFL_ENCTYPE_WCS2BE | MBFL_ENCTYPE_WCS2LE | MBFL_ENCTYPE_WCS4BE | MBFL_ENCTYPE_WCS4LE)) ||
    1195                 :            encoding->mblen_table != NULL) {
    1196             384 :                 len = string->len;
    1197             384 :                 start = from;
    1198             384 :                 end = from + length;
    1199             384 :                 if (encoding->flag & (MBFL_ENCTYPE_WCS2BE | MBFL_ENCTYPE_WCS2LE)) {
    1200               4 :                         start *= 2;
    1201               4 :                         end = start + length*2;
    1202             380 :                 } else if (encoding->flag & (MBFL_ENCTYPE_WCS4BE | MBFL_ENCTYPE_WCS4LE)) {
    1203               8 :                         start *= 4;
    1204               8 :                         end = start + length*4;
    1205             372 :                 } else if (encoding->mblen_table != NULL) {
    1206             285 :                         mbtab = encoding->mblen_table;
    1207             285 :                         start = 0;
    1208             285 :                         end = 0;
    1209             285 :                         n = 0;
    1210             285 :                         k = 0;
    1211             285 :                         p = string->val;
    1212             285 :                         if (p != NULL) {
    1213                 :                                 /* search start position */
    1214            1857 :                                 while (k <= from) {
    1215            1304 :                                         start = n;
    1216            1304 :                                         if (n >= len) {
    1217              17 :                                                 break;
    1218                 :                                         }
    1219            1287 :                                         m = mbtab[*p];
    1220            1287 :                                         n += m;
    1221            1287 :                                         p += m;
    1222            1287 :                                         k++;
    1223                 :                                 }
    1224                 :                                 /* detect end position */
    1225             285 :                                 k = 0;
    1226             285 :                                 end = start;
    1227            2231 :                                 while (k < length) {
    1228            1771 :                                         end = n;
    1229            1771 :                                         if (n >= len) {
    1230             110 :                                                 break;
    1231                 :                                         }
    1232            1661 :                                         m = mbtab[*p];
    1233            1661 :                                         n += m;
    1234            1661 :                                         p += m;
    1235            1661 :                                         k++;
    1236                 :                                 }
    1237                 :                         }
    1238                 :                 }
    1239                 : 
    1240             384 :                 if (start > len) {
    1241               0 :                         start = len;
    1242                 :                 }
    1243             384 :                 if (start < 0) {
    1244               0 :                         start = 0;
    1245                 :                 }
    1246             384 :                 if (end > len) {
    1247              22 :                         end = len;
    1248                 :                 }
    1249             384 :                 if (end < 0) {
    1250               0 :                         end = 0;
    1251                 :                 }
    1252             384 :                 if (start > end) {
    1253               0 :                         start = end;
    1254                 :                 }
    1255                 : 
    1256                 :                 /* allocate memory and copy */
    1257             384 :                 n = end - start;
    1258             384 :                 result->len = 0;
    1259             384 :                 result->val = w = (unsigned char*)mbfl_malloc((n + 8)*sizeof(unsigned char));
    1260             384 :                 if (w != NULL) {
    1261             384 :                         p = string->val;
    1262             384 :                         if (p != NULL) {
    1263             384 :                                 p += start;
    1264             384 :                                 result->len = n;
    1265            3976 :                                 while (n > 0) {
    1266            3208 :                                         *w++ = *p++;
    1267            3208 :                                         n--;
    1268                 :                                 }
    1269                 :                         }
    1270             384 :                         *w++ = '\0';
    1271             384 :                         *w++ = '\0';
    1272             384 :                         *w++ = '\0';
    1273             384 :                         *w = '\0';
    1274                 :                 } else {
    1275               0 :                         result = NULL;
    1276                 :                 }
    1277                 :         } else {
    1278                 :                 mbfl_memory_device device;
    1279                 :                 struct collector_substr_data pc;
    1280                 :                 mbfl_convert_filter *decoder;
    1281                 :                 mbfl_convert_filter *encoder;
    1282                 : 
    1283              20 :                 mbfl_memory_device_init(&device, length + 1, 0);
    1284              20 :                 mbfl_string_init(result);
    1285              20 :                 result->no_language = string->no_language;
    1286              20 :                 result->no_encoding = string->no_encoding;
    1287                 :                 /* output code filter */
    1288              20 :                 decoder = mbfl_convert_filter_new(
    1289                 :                     mbfl_no_encoding_wchar,
    1290                 :                     string->no_encoding,
    1291                 :                     mbfl_memory_device_output, 0, &device);
    1292                 :                 /* wchar filter */
    1293              20 :                 encoder = mbfl_convert_filter_new(
    1294                 :                     string->no_encoding,
    1295                 :                     mbfl_no_encoding_wchar,
    1296                 :                     collector_substr, 0, &pc);
    1297              20 :                 if (decoder == NULL || encoder == NULL) {
    1298               0 :                         mbfl_convert_filter_delete(encoder);
    1299               0 :                         mbfl_convert_filter_delete(decoder);
    1300               0 :                         return NULL;
    1301                 :                 }
    1302              20 :                 pc.next_filter = decoder;
    1303              20 :                 pc.start = from;
    1304              20 :                 pc.stop = from + length;
    1305              20 :                 pc.output = 0;
    1306                 : 
    1307                 :                 /* feed data */
    1308              20 :                 p = string->val;
    1309              20 :                 n = string->len;
    1310              20 :                 if (p != NULL) {
    1311             184 :                         while (n > 0) {
    1312             161 :                                 if ((*encoder->filter_function)(*p++, encoder) < 0) {
    1313              17 :                                         break;
    1314                 :                                 }
    1315             144 :                                 n--;
    1316                 :                         }
    1317                 :                 }
    1318                 : 
    1319              20 :                 mbfl_convert_filter_flush(encoder);
    1320              20 :                 mbfl_convert_filter_flush(decoder);
    1321              20 :                 result = mbfl_memory_device_result(&device, result);
    1322              20 :                 mbfl_convert_filter_delete(encoder);
    1323              20 :                 mbfl_convert_filter_delete(decoder);
    1324                 :         }
    1325                 : 
    1326             404 :         return result;
    1327                 : }
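
The mblen_table branch of mbfl_substr turns a character range (from, length) into a byte range by walking the string one lead byte at a time. The sketch below reproduces those two loops in isolation so the arithmetic can be followed by hand. It assumes a non-negative `from`, and `demo_utf8_len` is a hypothetical stand-in for `encoding->mblen_table` that covers only UTF-8 lead bytes; neither helper is part of libmbfl.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical replacement for a mblen_table lookup: byte length of a
 * UTF-8 sequence given its lead byte. Continuation or invalid bytes
 * are treated as length 1 so the walk always makes progress. */
static int demo_utf8_len(unsigned char lead)
{
    if (lead < 0x80) return 1;
    if ((lead & 0xE0) == 0xC0) return 2;
    if ((lead & 0xF0) == 0xE0) return 3;
    if ((lead & 0xF8) == 0xF0) return 4;
    return 1;
}

/* Convert character offsets into the byte range [*start, *end). */
static void demo_char_range_to_bytes(const unsigned char *s, int len,
                                     int from, int length,
                                     int *start, int *end)
{
    int n = 0; /* byte offset of the next unread character */
    int k = 0; /* characters consumed in the current loop */

    assert(s != NULL && start != NULL && end != NULL);

    /* search start position: after this loop n sits just past
     * character number `from` (or at len if the string is shorter) */
    *start = 0;
    while (k <= from) {
        *start = n;
        if (n >= len) {
            break;
        }
        n += demo_utf8_len(s[n]);
        k++;
    }

    /* detect end position: each iteration records the offset reached
     * so far, then consumes one more character */
    k = 0;
    *end = *start;
    while (k < length) {
        *end = n;
        if (n >= len) {
            break;
        }
        n += demo_utf8_len(s[n]);
        k++;
    }

    /* clamp, as the listing does before copying */
    if (*end > len) {
        *end = len;
    }
    if (*start > *end) {
        *start = *end;
    }
}
```

For the 7-byte UTF-8 string "a", U+00E9, U+65E5, "b", asking for 2 characters from character 1 yields the byte range [1, 6), which covers the 2-byte and 3-byte sequences.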
    1328                 : 
    1329                 : 
    1330                 : /*
    1331                 :  *  strcut
    1332                 :  */
    1333                 : mbfl_string *
    1334                 : mbfl_strcut(
    1335                 :     mbfl_string *string,
    1336                 :     mbfl_string *result,
    1337                 :     int from,
    1338                 :     int length)
    1339               8 : {
    1340                 :         const mbfl_encoding *encoding;
    1341                 :         int n, m, k, len, start, end;
    1342                 :         unsigned char *p, *w;
    1343                 :         const unsigned char *mbtab;
    1344                 :         mbfl_memory_device device;
    1345                 :         mbfl_convert_filter *encoder, *encoder_tmp, *decoder, *decoder_tmp;
    1346                 : 
    1347               8 :         encoding = mbfl_no2encoding(string->no_encoding);
    1348               8 :         if (encoding == NULL || string == NULL || result == NULL) {
    1349               0 :                 return NULL;
    1350                 :         }
    1351               8 :         mbfl_string_init(result);
    1352               8 :         result->no_language = string->no_language;
    1353               8 :         result->no_encoding = string->no_encoding;
    1354                 : 
    1355              16 :         if ((encoding->flag & (MBFL_ENCTYPE_SBCS | MBFL_ENCTYPE_WCS2BE | MBFL_ENCTYPE_WCS2LE | MBFL_ENCTYPE_WCS4BE | MBFL_ENCTYPE_WCS4LE)) ||
    1356                 :            encoding->mblen_table != NULL) {
    1357               8 :                 len = string->len;
    1358               8 :                 start = from;
    1359               8 :                 end = from + length;
    1360               8 :                 if (encoding->flag & (MBFL_ENCTYPE_WCS2BE | MBFL_ENCTYPE_WCS2LE)) {
    1361               0 :                         start /= 2;
    1362               0 :                         start *= 2;
    1363               0 :                         end = length/2;
    1364               0 :                         end *= 2;
    1365               0 :                         end += start;
    1366               8 :                 } else if (encoding->flag & (MBFL_ENCTYPE_WCS4BE | MBFL_ENCTYPE_WCS4LE)) {
    1367               0 :                         start /= 4;
    1368               0 :                         start *= 4;
    1369               0 :                         end = length/4;
    1370               0 :                         end *= 4;
    1371               0 :                         end += start;
    1372               8 :                 } else if (encoding->mblen_table != NULL) {
    1373               8 :                         mbtab = encoding->mblen_table;
    1374               8 :                         start = 0;
    1375               8 :                         end = 0;
    1376               8 :                         n = 0;
    1377               8 :                         p = string->val;
    1378               8 :                         if (p != NULL) {
    1379                 :                                 /* search start position */
    1380                 :                                 for (;;) {
    1381              25 :                                         m = mbtab[*p];
    1382              25 :                                         n += m;
    1383              25 :                                         p += m;
    1384              25 :                                         if (n > from) {
    1385               8 :                                                 break;
    1386                 :                                         }
    1387              17 :                                         start = n;
    1388              17 :                                 }
    1389                 :                                 /* search end position */
    1390               8 :                                 k = start + length;
    1391               8 :                                 if (k >= (int)string->len) {
    1392               6 :                                         end = string->len;
    1393                 :                                 } else {
    1394               2 :                                         end = start;
    1395              13 :                                         while (n <= k) {
    1396               9 :                                                 end = n;
    1397               9 :                                                 m = mbtab[*p];
    1398               9 :                                                 n += m;
    1399               9 :                                                 p += m;
    1400                 :                                         }
    1401                 :                                 }
    1402                 :                         }
    1403                 :                 }
    1404                 : 
    1405               8 :                 if (start > len) {
    1406               0 :                         start = len;
    1407                 :                 }
    1408               8 :                 if (start < 0) {
    1409               0 :                         start = 0;
    1410                 :                 }
    1411               8 :                 if (end > len) {
    1412               0 :                         end = len;
    1413                 :                 }
    1414               8 :                 if (end < 0) {
    1415               0 :                         end = 0;
    1416                 :                 }
    1417               8 :                 if (start > end) {
    1418               0 :                         start = end;
    1419                 :                 }
    1420                 :                 /* allocate memory and copy string */
    1421               8 :                 n = end - start;
    1422               8 :                 result->len = 0;
    1423               8 :                 result->val = w = (unsigned char*)mbfl_malloc((n + 8)*sizeof(unsigned char));
    1424               8 :                 if (w != NULL) {
    1425               8 :                         result->len = n;
    1426               8 :                         p = &(string->val[start]);
    1427             141 :                         while (n > 0) {
    1428             125 :                                 *w++ = *p++;
    1429             125 :                                 n--;
    1430                 :                         }
    1431               8 :                         *w++ = '\0';
    1432               8 :                         *w++ = '\0';
    1433               8 :                         *w++ = '\0';
    1434               8 :                         *w = '\0';
    1435                 :                 } else {
    1436               0 :                         result = NULL;
    1437                 :                 }
    1438                 :         } else {
    1439                 :                 /* wchar filter */
    1440               0 :                 encoder = mbfl_convert_filter_new(
    1441                 :                   string->no_encoding,
    1442                 :                   mbfl_no_encoding_wchar,
    1443                 :                   mbfl_filter_output_null, 0, 0);
    1444               0 :                 encoder_tmp = mbfl_convert_filter_new(
    1445                 :                   string->no_encoding,
    1446                 :                   mbfl_no_encoding_wchar,
    1447                 :                   mbfl_filter_output_null, 0, 0);
    1448                 :                 /* output code filter */
    1449               0 :                 decoder = mbfl_convert_filter_new(
    1450                 :                   mbfl_no_encoding_wchar,
    1451                 :                   string->no_encoding,
    1452                 :                   mbfl_memory_device_output, 0, &device);
    1453               0 :                 decoder_tmp = mbfl_convert_filter_new(
    1454                 :                   mbfl_no_encoding_wchar,
    1455                 :                   string->no_encoding,
    1456                 :                   mbfl_memory_device_output, 0, &device);
    1457               0 :                 if (encoder == NULL || encoder_tmp == NULL || decoder == NULL || decoder_tmp == NULL) {
    1458               0 :                         mbfl_convert_filter_delete(encoder);
    1459               0 :                         mbfl_convert_filter_delete(encoder_tmp);
    1460               0 :                         mbfl_convert_filter_delete(decoder);
    1461               0 :                         mbfl_convert_filter_delete(decoder_tmp);
    1462               0 :                         return NULL;
    1463                 :                 }
    1464               0 :                 mbfl_memory_device_init(&device, length + 8, 0);
    1465               0 :                 k = 0;
    1466               0 :                 n = 0;
    1467               0 :                 p = string->val;
    1468               0 :                 if (p != NULL) {
    1469                 :                         /* search start position */
    1470               0 :                         while (n < from) {
    1471               0 :                                 (*encoder->filter_function)(*p++, encoder);
    1472               0 :                                 n++;
    1473                 :                         }
    1474                 :                         /* output a little shorter than "length" */
    1475               0 :                         encoder->output_function = mbfl_filter_output_pipe;
    1476               0 :                         encoder->data = decoder;
    1477               0 :                         k = length - 20;
    1478               0 :                         len = string->len;
    1479               0 :                         while (n < len && device.pos < k) {
    1480               0 :                                 (*encoder->filter_function)(*p++, encoder);
    1481               0 :                                 n++;
    1482                 :                         }
    1483                 :                         /* detect end position */
    1484                 :                         for (;;) {
    1485                 :                                 /* backup current state */
    1486               0 :                                 k = device.pos;
    1487               0 :                                 mbfl_convert_filter_copy(encoder, encoder_tmp);
    1488               0 :                                 mbfl_convert_filter_copy(decoder, decoder_tmp);
    1489               0 :                                 if (n >= len) {
    1490               0 :                                         break;
    1491                 :                                 }
    1492                 :                                 /* feed 1byte and flush */
    1493               0 :                                 (*encoder->filter_function)(*p, encoder);
    1494               0 :                                 (*encoder->filter_flush)(encoder);
    1495               0 :                                 (*decoder->filter_flush)(decoder);
    1496               0 :                                 if (device.pos > length) {
    1497               0 :                                         break;
    1498                 :                                 }
    1499                 :                                 /* restore filter and re-feed data */
    1500               0 :                                 device.pos = k;
    1501               0 :                                 mbfl_convert_filter_copy(encoder_tmp, encoder);
    1502               0 :                                 mbfl_convert_filter_copy(decoder_tmp, decoder);
    1503               0 :                                 (*encoder->filter_function)(*p, encoder);
    1504               0 :                                 p++;
    1505               0 :                                 n++;
    1506               0 :                         }
    1507               0 :                         device.pos = k;
    1508               0 :                         mbfl_convert_filter_copy(encoder_tmp, encoder);
    1509               0 :                         mbfl_convert_filter_copy(decoder_tmp, decoder);
    1510               0 :                         mbfl_convert_filter_flush(encoder);
    1511               0 :                         mbfl_convert_filter_flush(decoder);
    1512                 :                 }
    1513               0 :                 result = mbfl_memory_device_result(&device, result);
    1514               0 :                 mbfl_convert_filter_delete(encoder);
    1515               0 :                 mbfl_convert_filter_delete(encoder_tmp);
    1516               0 :                 mbfl_convert_filter_delete(decoder);
    1517               0 :                 mbfl_convert_filter_delete(decoder_tmp);
    1518                 :         }
    1519                 : 
    1520               8 :         return result;
    1521                 : }
    1522                 : 
    1523                 : 
    1524                 : /*
    1525                 :  *  strwidth
    1526                 :  */
    1527                 : static int is_fullwidth(int c)
    1528            8394 : {
    1529                 :         int i;
    1530                 : 
    1531            8394 :         if (c < mbfl_eaw_table[0].begin) {
    1532            4408 :                 return 0;
    1533                 :         }
    1534                 : 
    1535          123193 :         for (i = 0; i < sizeof(mbfl_eaw_table) / sizeof(mbfl_eaw_table[0]); i++) {
    1536          119381 :                 if (mbfl_eaw_table[i].begin <= c && c <= mbfl_eaw_table[i].end) {
    1537             174 :                         return 1;
    1538                 :                 }
    1539                 :         }
    1540                 : 
    1541            3812 :         return 0;
    1542                 : }
    1543                 : 
    1544                 : static int
    1545                 : filter_count_width(int c, void* data)
    1546            8310 : {
    1547            8310 :         (*(int *)data) += (is_fullwidth(c) ? 2: 1);
    1548            8310 :         return c;
    1549                 : }
    1550                 : 
    1551                 : int
    1552                 : mbfl_strwidth(mbfl_string *string)
    1553            8262 : {
    1554                 :         int len, n;
    1555                 :         unsigned char *p;
    1556                 :         mbfl_convert_filter *filter;
    1557                 : 
    1558            8262 :         len = 0;
    1559            8262 :         if (string->len > 0 && string->val != NULL) {
    1560                 :                 /* wchar filter */
    1561            8262 :                 filter = mbfl_convert_filter_new(
    1562                 :                     string->no_encoding,
    1563                 :                     mbfl_no_encoding_wchar,
    1564                 :                     filter_count_width, 0, &len);
    1565            8262 :                 if (filter == NULL) {
    1566               0 :                         mbfl_convert_filter_delete(filter);
    1567               0 :                         return -1;
    1568                 :                 }
    1569                 : 
    1570                 :                 /* feed data */
    1571            8262 :                 p = string->val;
    1572            8262 :                 n = string->len;
    1573           49631 :                 while (n > 0) {
    1574           33107 :                         (*filter->filter_function)(*p++, filter);
    1575           33107 :                         n--;
    1576                 :                 }
    1577                 : 
    1578            8262 :                 mbfl_convert_filter_flush(filter);
    1579            8262 :                 mbfl_convert_filter_delete(filter);
    1580                 :         }
    1581                 : 
    1582            8262 :         return len;
    1583                 : }
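
The width calculation above reduces to a range-table lookup per code point: 2 columns if the code point falls in a wide range, 1 otherwise. A minimal sketch of that idea follows. It works on already-decoded code points rather than going through a convert filter, and `demo_wide` is a deliberately abbreviated, illustrative table (a few well-known wide and fullwidth blocks), not the contents of `mbfl_eaw_table`.

```c
#include <assert.h>
#include <stddef.h>

struct demo_range {
    int begin;
    int end;
};

/* Illustrative subset only, sorted by begin: Hangul Jamo initials,
 * Hiragana, CJK Unified Ideographs, Fullwidth ASCII variants. */
static const struct demo_range demo_wide[] = {
    { 0x1100, 0x115F },
    { 0x3041, 0x3096 },
    { 0x4E00, 0x9FFF },
    { 0xFF01, 0xFF60 },
};

static int demo_is_fullwidth(int c)
{
    size_t i;

    /* Fast path: everything below the first range is narrow. */
    if (c < demo_wide[0].begin) {
        return 0;
    }
    for (i = 0; i < sizeof(demo_wide) / sizeof(demo_wide[0]); i++) {
        if (demo_wide[i].begin <= c && c <= demo_wide[i].end) {
            return 1;
        }
    }
    return 0;
}

/* Sum display columns over an array of code points. */
static int demo_strwidth(const int *cps, size_t count)
{
    int width = 0;
    size_t i;

    assert(cps != NULL || count == 0);
    for (i = 0; i < count; i++) {
        width += demo_is_fullwidth(cps[i]) ? 2 : 1;
    }
    return width;
}
```

With this table, the sequence 'A', U+65E5, U+672C, 'b' measures 1 + 2 + 2 + 1 = 6 columns. The early return mirrors the listing, where more than half of the calls to is_fullwidth (4408 of 8394) exit before the table is scanned.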
    1584                 : 
    1585                 : 
    1586                 : /*
    1587                 :  *  strimwidth
    1588                 :  */
    1589                 : struct collector_strimwidth_data {
    1590                 :         mbfl_convert_filter *decoder;
    1591                 :         mbfl_convert_filter *decoder_backup;
    1592                 :         mbfl_memory_device device;
    1593                 :         int from;
    1594                 :         int width;
    1595                 :         int outwidth;
    1596                 :         int outchar;
    1597                 :         int status;
    1598                 :         int endpos;
    1599                 : };
    1600                 : 
    1601                 : static int
    1602                 : collector_strimwidth(int c, void* data)
    1603             105 : {
    1604             105 :         struct collector_strimwidth_data *pc = (struct collector_strimwidth_data*)data;
    1605                 : 
    1606             105 :         switch (pc->status) {
    1607                 :         case 10:
    1608               6 :                 (*pc->decoder->filter_function)(c, pc->decoder);
    1609               6 :                 break;
    1610                 :         default:
    1611              99 :                 if (pc->outchar >= pc->from) {
    1612              84 :                         pc->outwidth += (is_fullwidth(c) ? 2: 1);
    1613                 : 
    1614              84 :                         if (pc->outwidth > pc->width) {
    1615               5 :                                 if (pc->status == 0) {
    1616               3 :                                         pc->endpos = pc->device.pos;
    1617               3 :                                         mbfl_convert_filter_copy(pc->decoder, pc->decoder_backup);
    1618                 :                                 }
    1619               5 :                                 pc->status++;
    1620               5 :                                 (*pc->decoder->filter_function)(c, pc->decoder);
    1621               5 :                                 c = -1;
    1622                 :                         } else {
    1623              79 :                                 (*pc->decoder->filter_function)(c, pc->decoder);
    1624                 :                         }
    1625                 :                 }
    1626              99 :                 pc->outchar++;
    1627                 :                 break;
    1628                 :         }
    1629                 : 
    1630             105 :         return c;
    1631                 : }
    1632                 : 
    1633                 : mbfl_string *
    1634                 : mbfl_strimwidth(
    1635                 :     mbfl_string *string,
    1636                 :     mbfl_string *marker,
    1637                 :     mbfl_string *result,
    1638                 :     int from,
    1639                 :     int width)
    1640               5 : {
    1641                 :         struct collector_strimwidth_data pc;
    1642                 :         mbfl_convert_filter *encoder;
    1643                 :         int n, mkwidth;
    1644                 :         unsigned char *p;
    1645                 : 
    1646               5 :         if (string == NULL || result == NULL) {
    1647               0 :                 return NULL;
    1648                 :         }
    1649               5 :         mbfl_string_init(result);
    1650               5 :         result->no_language = string->no_language;
    1651               5 :         result->no_encoding = string->no_encoding;
    1652               5 :         mbfl_memory_device_init(&pc.device, width, 0);
    1653                 : 
    1654                 :         /* output code filter */
    1655               5 :         pc.decoder = mbfl_convert_filter_new(
    1656                 :             mbfl_no_encoding_wchar,
    1657                 :             string->no_encoding,
    1658                 :             mbfl_memory_device_output, 0, &pc.device);
    1659               5 :         pc.decoder_backup = mbfl_convert_filter_new(
    1660                 :             mbfl_no_encoding_wchar,
    1661                 :             string->no_encoding,
    1662                 :             mbfl_memory_device_output, 0, &pc.device);
    1663                 :         /* wchar filter */
    1664               5 :         encoder = mbfl_convert_filter_new(
    1665                 :             string->no_encoding,
    1666                 :             mbfl_no_encoding_wchar,
    1667                 :             collector_strimwidth, 0, &pc);
    1668               5 :         if (pc.decoder == NULL || pc.decoder_backup == NULL || encoder == NULL) {
    1669               0 :                 mbfl_convert_filter_delete(encoder);
    1670               0 :                 mbfl_convert_filter_delete(pc.decoder);
    1671               0 :                 mbfl_convert_filter_delete(pc.decoder_backup);
    1672               0 :                 return NULL;
    1673                 :         }
    1674               5 :         mkwidth = 0;
    1675               5 :         if (marker) {
    1676               5 :                 mkwidth = mbfl_strwidth(marker);
    1677                 :         }
    1678               5 :         pc.from = from;
    1679               5 :         pc.width = width - mkwidth;
    1680               5 :         pc.outwidth = 0;
    1681               5 :         pc.outchar = 0;
    1682               5 :         pc.status = 0;
    1683               5 :         pc.endpos = 0;
    1684                 : 
    1685                 :         /* feed data */
    1686               5 :         p = string->val;
    1687               5 :         n = string->len;
    1688               5 :         if (p != NULL) {
    1689             163 :                 while (n > 0) {
    1690             156 :                         n--;
    1691             156 :                         if ((*encoder->filter_function)(*p++, encoder) < 0) {
    1692               3 :                                 break;
    1693                 :                         }
    1694                 :                 }
    1695               5 :                 mbfl_convert_filter_flush(encoder);
    1696               8 :                 if (pc.status != 0 && mkwidth > 0) {
    1697               3 :                         pc.width += mkwidth;
    1698              11 :                         while (n > 0) {
    1699               7 :                                 if ((*encoder->filter_function)(*p++, encoder) < 0) {
    1700               2 :                                         break;
    1701                 :                                 }
    1702               5 :                                 n--;
    1703                 :                         }
    1704               3 :                         mbfl_convert_filter_flush(encoder);
    1705               3 :                         if (pc.status != 1) {
    1706               2 :                                 pc.status = 10;
    1707               2 :                                 pc.device.pos = pc.endpos;
    1708               2 :                                 mbfl_convert_filter_copy(pc.decoder_backup, pc.decoder);
    1709               2 :                                 mbfl_convert_filter_reset(encoder, marker->no_encoding, mbfl_no_encoding_wchar);
    1710               2 :                                 p = marker->val;
    1711               2 :                                 n = marker->len;
    1712              10 :                                 while (n > 0) {
    1713               6 :                                         if ((*encoder->filter_function)(*p++, encoder) < 0) {
    1714               0 :                                                 break;
    1715                 :                                         }
    1716               6 :                                         n--;
    1717                 :                                 }
    1718               2 :                                 mbfl_convert_filter_flush(encoder);
    1719                 :                         }
    1720               2 :                 } else if (pc.status != 0) {
    1721               0 :                         pc.device.pos = pc.endpos;
    1722               0 :                         mbfl_convert_filter_copy(pc.decoder_backup, pc.decoder);
    1723                 :                 }
    1724               5 :                 mbfl_convert_filter_flush(pc.decoder);
    1725                 :         }
    1726               5 :         result = mbfl_memory_device_result(&pc.device, result);
    1727               5 :         mbfl_convert_filter_delete(encoder);
    1728               5 :         mbfl_convert_filter_delete(pc.decoder);
    1729               5 :         mbfl_convert_filter_delete(pc.decoder_backup);
    1730                 : 
    1731               5 :         return result;
    1732                 : }
    1733                 : 
    1734                 : 
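The trimming logic of `mbfl_strimwidth` can be approximated on decoded code points. This is a sketch under stated assumptions: `cp_width` and `sketch_strimwidth` are hypothetical helpers, the `from` offset is omitted, and the two-pass structure replaces the decoder backup and `endpos` rollback used by the real code.

```c
#include <stddef.h>

/* Hypothetical simplified width rule: 2 columns for a few wide ranges. */
static int cp_width(unsigned int c)
{
    return ((c >= 0x3000 && c <= 0x30ff) || (c >= 0x4e00 && c <= 0x9fff)
         || (c >= 0xff01 && c <= 0xff60)) ? 2 : 1;
}

/* Trim `in` to at most `width` columns, appending `mk` when truncated.
 * `out` must have room for n + mkn code points. Returns the count written. */
static size_t sketch_strimwidth(const unsigned int *in, size_t n,
                                const unsigned int *mk, size_t mkn,
                                int width, unsigned int *out)
{
    int total = 0, mkwidth = 0, used = 0;
    size_t i, len = 0;

    for (i = 0; i < n; i++) {
        total += cp_width(in[i]);
    }
    if (total <= width) {               /* fits: copy unchanged, no marker */
        for (i = 0; i < n; i++) {
            out[len++] = in[i];
        }
        return len;
    }
    for (i = 0; i < mkn; i++) {
        mkwidth += cp_width(mk[i]);
    }
    /* keep characters while they fit in the space left after the marker */
    for (i = 0; i < n; i++) {
        int w = cp_width(in[i]);
        if (used + w > width - mkwidth) {
            break;
        }
        out[len++] = in[i];
        used += w;
    }
    for (i = 0; i < mkn; i++) {         /* append the trim marker */
        out[len++] = mk[i];
    }
    return len;
}
```

The real implementation reaches a similar result in one streaming pass: it first reserves room for the marker (`pc.width = width - mkwidth`), remembers the output position where that reduced width was exceeded, and rolls back to it only if the string also overflows the full width.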
    1735                 : 
    1736                 : /*
    1737                 :  *  convert Hankaku and Zenkaku
    1738                 :  */
    1739                 : struct collector_hantozen_data {
    1740                 :         mbfl_convert_filter *next_filter;
    1741                 :         int mode;
    1742                 :         int status;
    1743                 :         int cache;
    1744                 : };
    1745                 : 
    1746                 : static const unsigned char hankana2zenkata_table[64] = {
    1747                 :         0x00,0x02,0x0C,0x0D,0x01,0xFB,0xF2,0xA1,0xA3,0xA5,
    1748                 :         0xA7,0xA9,0xE3,0xE5,0xE7,0xC3,0xFC,0xA2,0xA4,0xA6,
    1749                 :         0xA8,0xAA,0xAB,0xAD,0xAF,0xB1,0xB3,0xB5,0xB7,0xB9,
    1750                 :         0xBB,0xBD,0xBF,0xC1,0xC4,0xC6,0xC8,0xCA,0xCB,0xCC,
    1751                 :         0xCD,0xCE,0xCF,0xD2,0xD5,0xD8,0xDB,0xDE,0xDF,0xE0,
    1752                 :         0xE1,0xE2,0xE4,0xE6,0xE8,0xE9,0xEA,0xEB,0xEC,0xED,
    1753                 :         0xEF,0xF3,0x9B,0x9C
    1754                 : };
    1755                 : static const unsigned char hankana2zenhira_table[64] = {
    1756                 :         0x00,0x02,0x0C,0x0D,0x01,0xFB,0x92,0x41,0x43,0x45,
    1757                 :         0x47,0x49,0x83,0x85,0x87,0x63,0xFC,0x42,0x44,0x46,
    1758                 :         0x48,0x4A,0x4B,0x4D,0x4F,0x51,0x53,0x55,0x57,0x59,
    1759                 :         0x5B,0x5D,0x5F,0x61,0x64,0x66,0x68,0x6A,0x6B,0x6C,
    1760                 :         0x6D,0x6E,0x6F,0x72,0x75,0x78,0x7B,0x7E,0x7F,0x80,
    1761                 :         0x81,0x82,0x84,0x86,0x88,0x89,0x8A,0x8B,0x8C,0x8D,
    1762                 :         0x8F,0x93,0x9B,0x9C
    1763                 : };
    1764                 : static const unsigned char zenkana2hankana_table[84][2] = {
    1765                 :         {0x67,0x00},{0x71,0x00},{0x68,0x00},{0x72,0x00},{0x69,0x00},
    1766                 :         {0x73,0x00},{0x6A,0x00},{0x74,0x00},{0x6B,0x00},{0x75,0x00},
    1767                 :         {0x76,0x00},{0x76,0x9E},{0x77,0x00},{0x77,0x9E},{0x78,0x00},
    1768                 :         {0x78,0x9E},{0x79,0x00},{0x79,0x9E},{0x7A,0x00},{0x7A,0x9E},
    1769                 :         {0x7B,0x00},{0x7B,0x9E},{0x7C,0x00},{0x7C,0x9E},{0x7D,0x00},
    1770                 :         {0x7D,0x9E},{0x7E,0x00},{0x7E,0x9E},{0x7F,0x00},{0x7F,0x9E},
    1771                 :         {0x80,0x00},{0x80,0x9E},{0x81,0x00},{0x81,0x9E},{0x6F,0x00},
    1772                 :         {0x82,0x00},{0x82,0x9E},{0x83,0x00},{0x83,0x9E},{0x84,0x00},
    1773                 :         {0x84,0x9E},{0x85,0x00},{0x86,0x00},{0x87,0x00},{0x88,0x00},
    1774                 :         {0x89,0x00},{0x8A,0x00},{0x8A,0x9E},{0x8A,0x9F},{0x8B,0x00},
    1775                 :         {0x8B,0x9E},{0x8B,0x9F},{0x8C,0x00},{0x8C,0x9E},{0x8C,0x9F},
    1776                 :         {0x8D,0x00},{0x8D,0x9E},{0x8D,0x9F},{0x8E,0x00},{0x8E,0x9E},
    1777                 :         {0x8E,0x9F},{0x8F,0x00},{0x90,0x00},{0x91,0x00},{0x92,0x00},
    1778                 :         {0x93,0x00},{0x6C,0x00},{0x94,0x00},{0x6D,0x00},{0x95,0x00},
    1779                 :         {0x6E,0x00},{0x96,0x00},{0x97,0x00},{0x98,0x00},{0x99,0x00},
    1780                 :         {0x9A,0x00},{0x9B,0x00},{0x9C,0x00},{0x9C,0x00},{0x72,0x00},
    1781                 :         {0x74,0x00},{0x66,0x00},{0x9D,0x00},{0x73,0x9E}
    1782                 : };
    1783                 : 
    1784                 : static int
    1785                 : collector_hantozen(int c, void* data)
    1786               0 : {
    1787                 :         int s, mode, n;
    1788               0 :         struct collector_hantozen_data *pc = (struct collector_hantozen_data*)data;
    1789                 : 
    1790               0 :         s = c;
    1791               0 :         mode = pc->mode;
    1792                 : 
    1793               0 :         if (mode & 0xf) { /* hankaku to zenkaku */
    1794               0 :                 if ((mode & 0x1) && c >= 0x21 && c <= 0x7d && c != 0x22 && c != 0x27 && c != 0x5c) {  /* all except <"> <'> <\> <~> */
    1795               0 :                         s = c + 0xfee0;
    1796               0 :                 } else if ((mode & 0x2) && ((c >= 0x41 && c <= 0x5a) || (c >= 0x61 && c <= 0x7a))) {    /* alpha */
    1797               0 :                         s = c + 0xfee0;
    1798               0 :                 } else if ((mode & 0x4) && c >= 0x30 && c <= 0x39) {  /* num */
    1799               0 :                         s = c + 0xfee0;
    1800               0 :                 } else if ((mode & 0x8) && c == 0x20) {     /* space */
    1801               0 :                         s = 0x3000;
    1802                 :                 }
    1803                 :         }
    1804                 : 
    1805               0 :         if (mode & 0xf0) { /* zenkaku to hankaku */
    1806               0 :                 if ((mode & 0x10) && c >= 0xff01 && c <= 0xff5d && c != 0xff02 && c != 0xff07 && c!= 0xff3c) {        /* all except <"> <'> <\> <~> */
    1807               0 :                         s = c - 0xfee0;
    1808               0 :                 } else if ((mode & 0x20) && ((c >= 0xff21 && c <= 0xff3a) || (c >= 0xff41 && c <= 0xff5a))) {   /* alpha */
    1809               0 :                         s = c - 0xfee0;
    1810               0 :                 } else if ((mode & 0x40) && (c >= 0xff10 && c <= 0xff19)) {   /* num */
    1811               0 :                         s = c - 0xfee0;
    1812               0 :                 } else if ((mode & 0x80) && (c == 0x3000)) {        /* space */
    1813               0 :                         s = 0x20;
    1814               0 :                 } else if ((mode & 0x10) && (c == 0x2212)) {        /* MINUS SIGN */
    1815               0 :                         s = 0x2d;
    1816                 :                 }
    1817                 :         }
    1818                 : 
    1819               0 :         if (mode & 0x300) { /* hankaku kana to zenkaku kana */
    1820               0 :                 if ((mode & 0x100) && (mode & 0x800)) { /* hankaku kana to zenkaku katakana and glue voiced sound mark */
    1821               0 :                         if (c >= 0xff61 && c <= 0xff9f) {
    1822               0 :                                 if (pc->status) {
    1823               0 :                                         n = (pc->cache - 0xff60) & 0x3f;
    1824               0 :                                         if (c == 0xff9e && ((n >= 22 && n <= 36) || (n >= 42 && n <= 46))) {
    1825               0 :                                                 pc->status = 0;
    1826               0 :                                                 s = 0x3001 + hankana2zenkata_table[n];
    1827               0 :                                         } else if (c == 0xff9e && n == 19) {
    1828               0 :                                                 pc->status = 0;
    1829               0 :                                                 s = 0x30f4;
    1830               0 :                                         } else if (c == 0xff9f && (n >= 42 && n <= 46)) {
    1831               0 :                                                 pc->status = 0;
    1832               0 :                                                 s = 0x3002 + hankana2zenkata_table[n];
    1833                 :                                         } else {
    1834               0 :                                                 pc->status = 1;
    1835               0 :                                                 pc->cache = c;
    1836               0 :                                                 s = 0x3000 + hankana2zenkata_table[n];
    1837                 :                                         }
    1838                 :                                 } else {
    1839               0 :                                         pc->status = 1;
    1840               0 :                                         pc->cache = c;
    1841               0 :                                         return c;
    1842                 :                                 }
    1843                 :                         } else {
    1844               0 :                                 if (pc->status) {
    1845               0 :                                         n = (pc->cache - 0xff60) & 0x3f;
    1846               0 :                                         pc->status = 0;
    1847               0 :                                         (*pc->next_filter->filter_function)(0x3000 + hankana2zenkata_table[n], pc->next_filter);
    1848                 :                                 }
    1849                 :                         }
    1850               0 :                 } else if ((mode & 0x200) && (mode & 0x800)) {  /* hankaku kana to zenkaku hiragana and glue voiced sound mark */
    1851               0 :                         if (c >= 0xff61 && c <= 0xff9f) {
    1852               0 :                                 if (pc->status) {
    1853               0 :                                         n = (pc->cache - 0xff60) & 0x3f;
    1854               0 :                                         if (c == 0xff9e && ((n >= 22 && n <= 36) || (n >= 42 && n <= 46))) {
    1855               0 :                                                 pc->status = 0;
    1856               0 :                                                 s = 0x3001 + hankana2zenhira_table[n];
    1857               0 :                                         } else if (c == 0xff9f && (n >= 42 && n <= 46)) {
    1858               0 :                                                 pc->status = 0;
    1859               0 :                                                 s = 0x3002 + hankana2zenhira_table[n];
    1860                 :                                         } else {
    1861               0 :                                                 pc->status = 1;
    1862               0 :                                                 pc->cache = c;
    1863               0 :                                                 s = 0x3000 + hankana2zenhira_table[n];
    1864                 :                                         }
    1865                 :                                 } else {
    1866               0 :                                         pc->status = 1;
    1867               0 :                                         pc->cache = c;
    1868               0 :                                         return c;
    1869                 :                                 }
    1870                 :                         } else {
    1871               0 :                                 if (pc->status) {
    1872               0 :                                         n = (pc->cache - 0xff60) & 0x3f;
    1873               0 :                                         pc->status = 0;
    1874               0 :                                         (*pc->next_filter->filter_function)(0x3000 + hankana2zenhira_table[n], pc->next_filter);
    1875                 :                                 }
    1876                 :                         }
    1877               0 :                 } else if ((mode & 0x100) && c >= 0xff61 && c <= 0xff9f) {    /* hankaku kana to zenkaku katakana */
    1878               0 :                         s = 0x3000 + hankana2zenkata_table[c - 0xff60];
    1879               0 :                 } else if ((mode & 0x200) && c >= 0xff61 && c <= 0xff9f) {    /* hankaku kana to zenkaku hiragana */
    1880               0 :                         s = 0x3000 + hankana2zenhira_table[c - 0xff60];
    1881                 :                 }
    1882                 :         }
    1883                 : 
    1884               0 :         if (mode & 0x3000) {        /* Zenkaku kana to hankaku kana */
    1885               0 :                 if ((mode & 0x1000) && c >= 0x30a1 && c <= 0x30f4) {  /* Zenkaku katakana to hankaku kana */
    1886               0 :                         n = c - 0x30a1;
    1887               0 :                         if (zenkana2hankana_table[n][1] != 0) {
    1888               0 :                                 (*pc->next_filter->filter_function)(0xff00 + zenkana2hankana_table[n][0], pc->next_filter);
    1889               0 :                                 s = 0xff00 + zenkana2hankana_table[n][1];
    1890                 :                         } else {
    1891               0 :                                 s = 0xff00 + zenkana2hankana_table[n][0];
    1892                 :                         }
    1893               0 :                 } else if ((mode & 0x2000) && c >= 0x3041 && c <= 0x3093) {   /* Zenkaku hiragana to hankaku kana */
    1894               0 :                         n = c - 0x3041;
    1895               0 :                         if (zenkana2hankana_table[n][1] != 0) {
    1896               0 :                                 (*pc->next_filter->filter_function)(0xff00 + zenkana2hankana_table[n][0], pc->next_filter);
    1897               0 :                                 s = 0xff00 + zenkana2hankana_table[n][1];
    1898                 :                         } else {
    1899               0 :                                 s = 0xff00 + zenkana2hankana_table[n][0];
    1900                 :                         }
    1901               0 :                 } else if (c == 0x3001) {
    1902               0 :                         s = 0xff64;                             /* HALFWIDTH IDEOGRAPHIC COMMA */
    1903               0 :                 } else if (c == 0x3002) {
    1904               0 :                         s = 0xff61;                             /* HALFWIDTH IDEOGRAPHIC FULL STOP */
    1905               0 :                 } else if (c == 0x300c) {
    1906               0 :                         s = 0xff62;                             /* HALFWIDTH LEFT CORNER BRACKET */
    1907               0 :                 } else if (c == 0x300d) {
    1908               0 :                         s = 0xff63;                             /* HALFWIDTH RIGHT CORNER BRACKET */
    1909               0 :                 } else if (c == 0x309b) {
    1910               0 :                         s = 0xff9e;                             /* HALFWIDTH KATAKANA VOICED SOUND MARK */
    1911               0 :                 } else if (c == 0x309c) {
    1912               0 :                         s = 0xff9f;                             /* HALFWIDTH KATAKANA SEMI-VOICED SOUND MARK */
    1913               0 :                 } else if (c == 0x30fc) {
    1914               0 :                         s = 0xff70;                             /* HALFWIDTH KATAKANA-HIRAGANA PROLONGED SOUND MARK */
    1915               0 :                 } else if (c == 0x30fb) {
    1916               0 :                         s = 0xff65;                             /* HALFWIDTH KATAKANA MIDDLE DOT */
    1917                 :                 }
    1918               0 :         } else if (mode & 0x30000) { 
    1919               0 :                 if ((mode & 0x10000) && c >= 0x3041 && c <= 0x3093) { /* Zenkaku hiragana to Zenkaku katakana */
    1920               0 :                         s = c + 0x60;
    1921               0 :                 } else if ((mode & 0x20000) && c >= 0x30a1 && c <= 0x30f3) {  /* Zenkaku katakana to Zenkaku hiragana */
    1922               0 :                         s = c - 0x60;
    1923                 :                 }
    1924                 :         }
    1925                 : 
    1926               0 :         if (mode & 0x100000) {      /* special ascii to symbol */
    1927               0 :                 if (c == 0x5c) {
    1928               0 :                         s = 0xffe5;                             /* FULLWIDTH YEN SIGN */
    1929               0 :                 } else if (c == 0xa5) {         /* YEN SIGN */
    1930               0 :                         s = 0xffe5;                             /* FULLWIDTH YEN SIGN */
    1931               0 :                 } else if (c == 0x7e) {
    1932               0 :                         s = 0xffe3;                             /* FULLWIDTH MACRON */
    1933               0 :                 } else if (c == 0x203e) {       /* OVERLINE */
    1934               0 :                         s = 0xffe3;                             /* FULLWIDTH MACRON */
    1935               0 :                 } else if (c == 0x27) {
    1936               0 :                         s = 0x2019;                             /* RIGHT SINGLE QUOTATION MARK */
    1937               0 :                 } else if (c == 0x22) {
    1938               0 :                         s = 0x201d;                             /* RIGHT DOUBLE QUOTATION MARK */
    1939                 :                 }
    1940               0 :         } else if (mode & 0x200000) {       /* special symbol to ascii */
    1941               0 :                 if (c == 0xffe5) {                      /* FULLWIDTH YEN SIGN */
    1942               0 :                         s = 0x5c;
    1943               0 :                 } else if (c == 0xff3c) {       /* FULLWIDTH REVERSE SOLIDUS */
    1944               0 :                         s = 0x5c;
    1945               0 :                 } else if (c == 0xffe3) {       /* FULLWIDTH MACRON */
    1946               0 :                         s = 0x7e;
    1947               0 :                 } else if (c == 0x203e) {       /* OVERLINE */
    1948               0 :                         s = 0x7e;
    1949               0 :                 } else if (c == 0x2018) {       /* LEFT SINGLE QUOTATION MARK */
    1950               0 :                         s = 0x27;
    1951               0 :                 } else if (c == 0x2019) {       /* RIGHT SINGLE QUOTATION MARK */
    1952               0 :                         s = 0x27;
    1953               0 :                 } else if (c == 0x201c) {       /* LEFT DOUBLE QUOTATION MARK */
    1954               0 :                         s = 0x22;
    1955               0 :                 } else if (c == 0x201d) {       /* RIGHT DOUBLE QUOTATION MARK */
    1956               0 :                         s = 0x22;
    1957                 :                 }
    1958                 :         }
    1959                 : 
    1960               0 :         if (mode & 0x400000) {      /* special ascii to symbol */
    1961               0 :                 if (c == 0x5c) {
    1962               0 :                         s = 0xff3c;                             /* FULLWIDTH REVERSE SOLIDUS */
    1963               0 :                 } else if (c == 0x7e) {
    1964               0 :                         s = 0xff5e;                             /* FULLWIDTH TILDE */
    1965               0 :                 } else if (c == 0x27) {
    1966               0 :                         s = 0xff07;                             /* FULLWIDTH APOSTROPHE */
    1967               0 :                 } else if (c == 0x22) {
    1968               0 :                         s = 0xff02;                             /* FULLWIDTH QUOTATION MARK */
    1969                 :                 }
    1970               0 :         } else if (mode & 0x800000) {       /* special symbol to ascii */
    1971               0 :                 if (c == 0xff3c) {                      /* FULLWIDTH REVERSE SOLIDUS */
    1972               0 :                         s = 0x5c;
    1973               0 :                 } else if (c == 0xff5e) {       /* FULLWIDTH TILDE */
    1974               0 :                         s = 0x7e;
    1975               0 :                 } else if (c == 0xff07) {       /* FULLWIDTH APOSTROPHE */
    1976               0 :                         s = 0x27;
    1977               0 :                 } else if (c == 0xff02) {       /* FULLWIDTH QUOTATION MARK */
    1978               0 :                         s = 0x22;
    1979                 :                 }
    1980                 :         }
    1981                 : 
    1982               0 :         return (*pc->next_filter->filter_function)(s, pc->next_filter);
    1983                 : }
    1984                 : 
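The ASCII branches of `collector_hantozen` rely on the fixed offset 0xfee0 between U+0021..U+007E and the fullwidth forms U+FF01..U+FF5E. A minimal sketch of that mapping follows; `to_fullwidth` and `to_halfwidth` are hypothetical names, and unlike mode bit 0x1 above, this version maps the whole range without excluding the quote, backslash, and tilde characters.

```c
/* Map a halfwidth ASCII code point to its fullwidth form.
 * Space has no offset-based counterpart and maps to U+3000. */
static unsigned int to_fullwidth(unsigned int c)
{
    if (c == 0x20) {
        return 0x3000;              /* IDEOGRAPHIC SPACE */
    }
    if (c >= 0x21 && c <= 0x7e) {
        return c + 0xfee0;          /* U+FF01..U+FF5E */
    }
    return c;                       /* everything else passes through */
}

/* Inverse mapping: fullwidth form back to ASCII. */
static unsigned int to_halfwidth(unsigned int c)
{
    if (c == 0x3000) {
        return 0x20;
    }
    if (c >= 0xff01 && c <= 0xff5e) {
        return c - 0xfee0;
    }
    return c;
}
```

The listing splits this range across several mode bits (all, alpha, num, space) so callers can convert only the character classes they select.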
    1985                 : static int
    1986                 : collector_hantozen_flush(struct collector_hantozen_data *pc)
    1987               0 : {
    1988                 :         int ret, n;
    1989                 : 
    1990               0 :         ret = 0;
    1991               0 :         if (pc->status) {
    1992               0 :                 n = (pc->cache - 0xff60) & 0x3f;
    1993               0 :                 if (pc->mode & 0x100) {  /* hankaku kana to zenkaku katakana */
    1994               0 :                         ret = (*pc->next_filter->filter_function)(0x3000 + hankana2zenkata_table[n], pc->next_filter);
    1995               0 :                 } else if (pc->mode & 0x200) {   /* hankaku kana to zenkaku hiragana */
    1996               0 :                         ret = (*pc->next_filter->filter_function)(0x3000 + hankana2zenhira_table[n], pc->next_filter);
    1997                 :                 }
    1998               0 :                 pc->status = 0;
    1999                 :         }
    2000                 : 
    2001               0 :         return ret;
    2002                 : }
    2003                 : 
    2004                 : mbfl_string *
    2005                 : mbfl_ja_jp_hantozen(
    2006                 :     mbfl_string *string,
    2007                 :     mbfl_string *result,
    2008                 :     int mode)
    2009               0 : {
    2010                 :         int n;
    2011                 :         unsigned char *p;
    2012                 :         const mbfl_encoding *encoding;
    2013                 :         mbfl_memory_device device;
    2014                 :         struct collector_hantozen_data pc;
    2015                 :         mbfl_convert_filter *decoder;
    2016                 :         mbfl_convert_filter *encoder;
    2017                 : 
    2018                 :         /* initialize */
    2019               0 :         if (string == NULL || result == NULL) {
    2020               0 :                 return NULL;
    2021                 :         }
    2022               0 :         encoding = mbfl_no2encoding(string->no_encoding);
    2023               0 :         if (encoding == NULL) {
    2024               0 :                 return NULL;
    2025                 :         }
    2026               0 :         mbfl_memory_device_init(&device, string->len, 0);
    2027               0 :         mbfl_string_init(result);
    2028               0 :         result->no_language = string->no_language;
    2029               0 :         result->no_encoding = string->no_encoding;
    2030               0 :         decoder = mbfl_convert_filter_new(
    2031                 :           mbfl_no_encoding_wchar,
    2032                 :           string->no_encoding,
    2033                 :           mbfl_memory_device_output, 0, &device);
    2034               0 :         encoder = mbfl_convert_filter_new(
    2035                 :           string->no_encoding,
    2036                 :           mbfl_no_encoding_wchar,
    2037                 :           collector_hantozen, 0, &pc);
    2038               0 :         if (decoder == NULL || encoder == NULL) {
    2039               0 :                 mbfl_convert_filter_delete(encoder);
    2040               0 :                 mbfl_convert_filter_delete(decoder);
    2041               0 :                 return NULL;
    2042                 :         }
    2043               0 :         pc.next_filter = decoder;
    2044               0 :         pc.mode = mode;
    2045               0 :         pc.status = 0;
    2046               0 :         pc.cache = 0;
    2047                 : 
    2048                 :         /* feed data */
    2049               0 :         p = string->val;
    2050               0 :         n = string->len;
    2051               0 :         if (p != NULL) {
    2052               0 :                 while (n > 0) {
    2053               0 :                         if ((*encoder->filter_function)(*p++, encoder) < 0) {
    2054               0 :                                 break;
    2055                 :                         }
    2056               0 :                         n--;
    2057                 :                 }
    2058                 :         }
    2059                 : 
    2060               0 :         mbfl_convert_filter_flush(encoder);
    2061               0 :         collector_hantozen_flush(&pc);
    2062               0 :         mbfl_convert_filter_flush(decoder);
    2063               0 :         result = mbfl_memory_device_result(&device, result);
    2064               0 :         mbfl_convert_filter_delete(encoder);
    2065               0 :         mbfl_convert_filter_delete(decoder);
    2066                 : 
    2067               0 :         return result;
    2068                 : }
    2069                 : 
    2070                 : 
    2071                 : /*
    2072                 :  *  MIME header encode
    2073                 :  */
    2074                 : struct mime_header_encoder_data {
    2075                 :         mbfl_convert_filter *conv1_filter;
    2076                 :         mbfl_convert_filter *block_filter;
    2077                 :         mbfl_convert_filter *conv2_filter;
    2078                 :         mbfl_convert_filter *conv2_filter_backup;
    2079                 :         mbfl_convert_filter *encod_filter;
    2080                 :         mbfl_convert_filter *encod_filter_backup;
    2081                 :         mbfl_memory_device outdev;
    2082                 :         mbfl_memory_device tmpdev;
    2083                 :         int status1;
    2084                 :         int status2;
    2085                 :         int prevpos;
    2086                 :         int linehead;
    2087                 :         int firstindent;
    2088                 :         int encnamelen;
    2089                 :         int lwsplen;
    2090                 :         char encname[128];
    2091                 :         char lwsp[16];
    2092                 : };
    2093                 : 
    2094                 : static int
    2095                 : mime_header_encoder_block_collector(int c, void *data)
    2096           10421 : {
    2097                 :         int n;
    2098           10421 :         struct mime_header_encoder_data *pe = (struct mime_header_encoder_data *)data;
    2099                 : 
    2100           10421 :         switch (pe->status2) {
    2101                 :         case 1: /* encoded word */
    2102           10130 :                 pe->prevpos = pe->outdev.pos;
    2103           10130 :                 mbfl_convert_filter_copy(pe->conv2_filter, pe->conv2_filter_backup);
    2104           10130 :                 mbfl_convert_filter_copy(pe->encod_filter, pe->encod_filter_backup);
    2105           10130 :                 (*pe->conv2_filter->filter_function)(c, pe->conv2_filter);
    2106           10130 :                 (*pe->conv2_filter->filter_flush)(pe->conv2_filter);
    2107           10130 :                 (*pe->encod_filter->filter_flush)(pe->encod_filter);
    2108           10130 :                 n = pe->outdev.pos - pe->linehead + pe->firstindent;
    2109           10130 :                 pe->outdev.pos = pe->prevpos;
    2110           10130 :                 mbfl_convert_filter_copy(pe->conv2_filter_backup, pe->conv2_filter);
    2111           10130 :                 mbfl_convert_filter_copy(pe->encod_filter_backup, pe->encod_filter);
    2112           10130 :                 if (n >= 74) {
    2113             478 :                         (*pe->conv2_filter->filter_flush)(pe->conv2_filter);
    2114             478 :                         (*pe->encod_filter->filter_flush)(pe->encod_filter);
    2115             478 :                         mbfl_memory_device_strncat(&pe->outdev, "\x3f\x3d", 2);        /* ?= */
    2116             478 :                         mbfl_memory_device_strncat(&pe->outdev, pe->lwsp, pe->lwsplen);
    2117             478 :                         pe->linehead = pe->outdev.pos;
    2118             478 :                         pe->firstindent = 0;
    2119             478 :                         mbfl_memory_device_strncat(&pe->outdev, pe->encname, pe->encnamelen);
    2120             478 :                         c = (*pe->conv2_filter->filter_function)(c, pe->conv2_filter);
    2121                 :                 } else {
    2122            9652 :                         c = (*pe->conv2_filter->filter_function)(c, pe->conv2_filter);
    2123                 :                 }
    2124           10130 :                 break;
    2125                 : 
    2126                 :         default:
    2127             291 :                 mbfl_memory_device_strncat(&pe->outdev, pe->encname, pe->encnamelen);
    2128             291 :                 c = (*pe->conv2_filter->filter_function)(c, pe->conv2_filter);
    2129             291 :                 pe->status2 = 1;
    2130                 :                 break;
    2131                 :         }
    2132                 : 
    2133           10421 :         return c;
    2134                 : }
    2135                 : 
    2136                 : static int
    2137                 : mime_header_encoder_collector(int c, void *data)
    2138           10656 : {
    2139                 :         static int qp_table[256] = {
    2140                 :                 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, /* 0x00 */
    2141                 :                 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, /* 0x00 */
    2142                 :                 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* 0x20 */
    2143                 :                 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 1, /* 0x30 */
    2144                 :                 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* 0x40 */
    2145                 :                 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, /* 0x50 */
    2146                 :                 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* 0x60 */
    2147                 :                 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, /* 0x70 */
    2148                 :                 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, /* 0x80 */
    2149                 :                 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, /* 0x90 */
    2150                 :                 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, /* 0xA0 */
    2151                 :                 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, /* 0xB0 */
    2152                 :                 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, /* 0xC0 */
    2153                 :                 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, /* 0xD0 */
    2154                 :                 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, /* 0xE0 */
    2155                 :                 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1  /* 0xF0 */
    2156                 :         };
    2157                 : 
    2158                 :         int n;
    2159           10656 :         struct mime_header_encoder_data *pe = (struct mime_header_encoder_data *)data;
    2160                 : 
    2161           10656 :         switch (pe->status1) {
    2162                 :         case 11:        /* encoded word */
    2163           10128 :                 (*pe->block_filter->filter_function)(c, pe->block_filter);
    2164           10128 :                 break;
    2165                 : 
    2166                 :         default:        /* ASCII */
    2167             750 :                 if (c <= 0x00ff && !qp_table[(c & 0xff)]) { /* ordinary characters */
    2168             222 :                         mbfl_memory_device_output(c, &pe->tmpdev);
    2169             222 :                         pe->status1 = 1;
    2170             306 :                 } else if (pe->status1 == 0 && c == 0x20) {  /* repeat SPACE */
    2171               0 :                         mbfl_memory_device_output(c, &pe->tmpdev);
    2172                 :                 } else {
    2173             321 :                         if (pe->tmpdev.pos < 74 && c == 0x20) {
    2174              15 :                                 n = pe->outdev.pos - pe->linehead + pe->tmpdev.pos + pe->firstindent;
    2175              15 :                                 if (n > 74) {
    2176               0 :                                         mbfl_memory_device_strncat(&pe->outdev, pe->lwsp, pe->lwsplen);            /* LWSP */
    2177               0 :                                         pe->linehead = pe->outdev.pos;
    2178               0 :                                         pe->firstindent = 0;
    2179              15 :                                 } else if (pe->outdev.pos > 0) {
    2180               9 :                                         mbfl_memory_device_output(0x20, &pe->outdev);
    2181                 :                                 }
    2182              15 :                                 mbfl_memory_device_devcat(&pe->outdev, &pe->tmpdev);
    2183              15 :                                 mbfl_memory_device_reset(&pe->tmpdev);
    2184              15 :                                 pe->status1 = 0;
    2185                 :                         } else {
    2186             291 :                                 n = pe->outdev.pos - pe->linehead + pe->encnamelen + pe->firstindent;
    2187             291 :                                 if (n > 60)  {
    2188              46 :                                         mbfl_memory_device_strncat(&pe->outdev, pe->lwsp, pe->lwsplen);            /* LWSP */
    2189              46 :                                         pe->linehead = pe->outdev.pos;
    2190              46 :                                         pe->firstindent = 0;
    2191             245 :                                 } else if (pe->outdev.pos > 0)  {
    2192               0 :                                         mbfl_memory_device_output(0x20, &pe->outdev);
    2193                 :                                 }
    2194             291 :                                 mbfl_convert_filter_devcat(pe->block_filter, &pe->tmpdev);
    2195             291 :                                 mbfl_memory_device_reset(&pe->tmpdev);
    2196             291 :                                 (*pe->block_filter->filter_function)(c, pe->block_filter);
    2197             291 :                                 pe->status1 = 11;
    2198                 :                         }
    2199                 :                 }
    2200                 :                 break;
    2201                 :         }
    2202                 : 
    2203           10656 :         return c;
    2204                 : }
    2205                 : 
    2206                 : mbfl_string *
    2207                 : mime_header_encoder_result(struct mime_header_encoder_data *pe, mbfl_string *result)
    2208             325 : {
    2209             325 :         if (pe->status1 >= 10) {
    2210             291 :                 (*pe->conv2_filter->filter_flush)(pe->conv2_filter);
    2211             291 :                 (*pe->encod_filter->filter_flush)(pe->encod_filter);
    2212             291 :                 mbfl_memory_device_strncat(&pe->outdev, "\x3f\x3d", 2);                /* ?= */
    2213              34 :         } else if (pe->tmpdev.pos > 0) {
    2214              26 :                 if (pe->outdev.pos > 0) {
    2215               6 :                         if ((pe->outdev.pos - pe->linehead + pe->tmpdev.pos) > 74) {
    2216               0 :                                 mbfl_memory_device_strncat(&pe->outdev, pe->lwsp, pe->lwsplen);
    2217                 :                         } else {
    2218               6 :                                 mbfl_memory_device_output(0x20, &pe->outdev);
    2219                 :                         }
    2220                 :                 }
    2221              26 :                 mbfl_memory_device_devcat(&pe->outdev, &pe->tmpdev);
    2222                 :         }
    2223             325 :         mbfl_memory_device_reset(&pe->tmpdev);
    2224             325 :         pe->prevpos = 0;
    2225             325 :         pe->linehead = 0;
    2226             325 :         pe->status1 = 0;
    2227             325 :         pe->status2 = 0;
    2228                 : 
    2229             325 :         return mbfl_memory_device_result(&pe->outdev, result);
    2230                 : }
    2231                 : 
    2232                 : struct mime_header_encoder_data*
    2233                 : mime_header_encoder_new(
    2234                 :     enum mbfl_no_encoding incode,
    2235                 :     enum mbfl_no_encoding outcode,
    2236                 :     enum mbfl_no_encoding transenc)
    2237             325 : {
    2238                 :         int n;
    2239                 :         const char *s;
    2240                 :         const mbfl_encoding *outencoding;
    2241                 :         struct mime_header_encoder_data *pe;
    2242                 : 
    2243                 :         /* get output encoding and check MIME charset name */
    2244             325 :         outencoding = mbfl_no2encoding(outcode);
    2245             325 :         if (outencoding == NULL || outencoding->mime_name == NULL || outencoding->mime_name[0] == '\0') {
    2246               0 :                 return NULL;
    2247                 :         }
    2248                 : 
    2249             325 :         pe = (struct mime_header_encoder_data*)mbfl_malloc(sizeof(struct mime_header_encoder_data));
    2250             325 :         if (pe == NULL) {
    2251               0 :                 return NULL;
    2252                 :         }
    2253                 : 
    2254             325 :         mbfl_memory_device_init(&pe->outdev, 0, 0);
    2255             325 :         mbfl_memory_device_init(&pe->tmpdev, 0, 0);
    2256             325 :         pe->prevpos = 0;
    2257             325 :         pe->linehead = 0;
    2258             325 :         pe->firstindent = 0;
    2259             325 :         pe->status1 = 0;
    2260             325 :         pe->status2 = 0;
    2261                 : 
    2262                 :         /* build the encoded-word prefix string, e.g. "=?ISO-2022-JP?B?" */
    2263             325 :         n = 0;
    2264             325 :         pe->encname[n++] = 0x3d;
    2265             325 :         pe->encname[n++] = 0x3f;
    2266             325 :         s = outencoding->mime_name;
    2267            2324 :         while (*s) {
    2268            1674 :                 pe->encname[n++] = *s++;
    2269                 :         }
    2270             325 :         pe->encname[n++] = 0x3f;
    2271             325 :         if (transenc == mbfl_no_encoding_qprint) {
    2272             111 :                 pe->encname[n++] = 0x51;
    2273                 :         } else {
    2274             214 :                 pe->encname[n++] = 0x42;
    2275             214 :                 transenc = mbfl_no_encoding_base64;
    2276                 :         }
    2277             325 :         pe->encname[n++] = 0x3f;
    2278             325 :         pe->encname[n] = '\0';
    2279             325 :         pe->encnamelen = n;
    2280                 : 
    2281             325 :         n = 0;
    2282             325 :         pe->lwsp[n++] = 0x0d;
    2283             325 :         pe->lwsp[n++] = 0x0a;
    2284             325 :         pe->lwsp[n++] = 0x20;
    2285             325 :         pe->lwsp[n] = '\0';
    2286             325 :         pe->lwsplen = n;
    2287                 : 
    2288                 :         /* transfer encode filter */
    2289             325 :         pe->encod_filter = mbfl_convert_filter_new(outcode, transenc, mbfl_memory_device_output, 0, &(pe->outdev));
    2290             325 :         pe->encod_filter_backup = mbfl_convert_filter_new(outcode, transenc, mbfl_memory_device_output, 0, &(pe->outdev));
    2291                 : 
    2292                 :         /* Output code filter */
    2293             325 :         pe->conv2_filter = mbfl_convert_filter_new(mbfl_no_encoding_wchar, outcode, mbfl_filter_output_pipe, 0, pe->encod_filter);
    2294             325 :         pe->conv2_filter_backup = mbfl_convert_filter_new(mbfl_no_encoding_wchar, outcode, mbfl_filter_output_pipe, 0, pe->encod_filter);
    2295                 : 
    2296                 :         /* encoded block filter */
    2297             325 :         pe->block_filter = mbfl_convert_filter_new(mbfl_no_encoding_wchar, mbfl_no_encoding_wchar, mime_header_encoder_block_collector, 0, pe);
    2298                 : 
    2299                 :         /* Input code filter */
    2300             325 :         pe->conv1_filter = mbfl_convert_filter_new(incode, mbfl_no_encoding_wchar, mime_header_encoder_collector, 0, pe);
    2301                 : 
    2302             325 :         if (pe->encod_filter == NULL ||
    2303                 :             pe->encod_filter_backup == NULL ||
    2304                 :             pe->conv2_filter == NULL ||
    2305                 :             pe->conv2_filter_backup == NULL ||
    2306                 :             pe->conv1_filter == NULL) {
    2307               0 :                 mime_header_encoder_delete(pe);
    2308               0 :                 return NULL;
    2309                 :         }
    2310                 : 
    2311             325 :         if (transenc == mbfl_no_encoding_qprint) {
    2312             111 :                 pe->encod_filter->status |= MBFL_QPRINT_STS_MIME_HEADER;
    2313             111 :                 pe->encod_filter_backup->status |= MBFL_QPRINT_STS_MIME_HEADER;
    2314                 :         } else {
    2315             214 :                 pe->encod_filter->status |= MBFL_BASE64_STS_MIME_HEADER;
    2316             214 :                 pe->encod_filter_backup->status |= MBFL_BASE64_STS_MIME_HEADER;
    2317                 :         }
    2318                 : 
    2319             325 :         return pe;
    2320                 : }
    2321                 : 
    2322                 : void
    2323                 : mime_header_encoder_delete(struct mime_header_encoder_data *pe)
    2324             325 : {
    2325             325 :         if (pe) {
    2326             325 :                 mbfl_convert_filter_delete(pe->conv1_filter);
    2327             325 :                 mbfl_convert_filter_delete(pe->block_filter);
    2328             325 :                 mbfl_convert_filter_delete(pe->conv2_filter);
    2329             325 :                 mbfl_convert_filter_delete(pe->conv2_filter_backup);
    2330             325 :                 mbfl_convert_filter_delete(pe->encod_filter);
    2331             325 :                 mbfl_convert_filter_delete(pe->encod_filter_backup);
    2332             325 :                 mbfl_memory_device_clear(&pe->outdev);
    2333             325 :                 mbfl_memory_device_clear(&pe->tmpdev);
    2334             325 :                 mbfl_free((void*)pe);
    2335                 :         }
    2336             325 : }
    2337                 : 
    2338                 : int
    2339                 : mime_header_encoder_feed(int c, struct mime_header_encoder_data *pe)
    2340               0 : {
    2341               0 :         return (*pe->conv1_filter->filter_function)(c, pe->conv1_filter);
    2342                 : }
    2343                 : 
    2344                 : mbfl_string *
    2345                 : mbfl_mime_header_encode(
    2346                 :     mbfl_string *string,
    2347                 :     mbfl_string *result,
    2348                 :     enum mbfl_no_encoding outcode,
    2349                 :     enum mbfl_no_encoding encoding,
    2350                 :     const char *linefeed,
    2351                 :     int indent)
    2352             325 : {
    2353                 :         int n;
    2354                 :         unsigned char *p;
    2355                 :         struct mime_header_encoder_data *pe;
    2356                 : 
    2357             325 :         mbfl_string_init(result);
    2358             325 :         result->no_language = string->no_language;
    2359             325 :         result->no_encoding = mbfl_no_encoding_ascii;
    2360                 : 
    2361             325 :         pe = mime_header_encoder_new(string->no_encoding, outcode, encoding);
    2362             325 :         if (pe == NULL) {
    2363               0 :                 return NULL;
    2364                 :         }
    2365                 : 
    2366             325 :         if (linefeed != NULL) {
    2367             325 :                 n = 0;
    2368            1310 :                 while (*linefeed && n < 8) {
    2369             660 :                         pe->lwsp[n++] = *linefeed++;
    2370                 :                 }
    2371             325 :                 pe->lwsp[n++] = 0x20;
    2372             325 :                 pe->lwsp[n] = '\0';
    2373             325 :                 pe->lwsplen = n;
    2374                 :         }
    2375             325 :         if (indent > 0 && indent < 74) {
    2376             237 :                 pe->firstindent = indent;
    2377                 :         }
    2378                 : 
    2379             325 :         n = string->len;
    2380             325 :         p = string->val;
    2381           18016 :         while (n > 0) {
    2382           17366 :                 (*pe->conv1_filter->filter_function)(*p++, pe->conv1_filter);
    2383           17366 :                 n--;
    2384                 :         }
    2385                 : 
    2386             325 :         result = mime_header_encoder_result(pe, result);
    2387             325 :         mime_header_encoder_delete(pe);
    2388                 : 
    2389             325 :         return result;
    2390                 : }
    2391                 : 
    2392                 : 
    2393                 : /*
    2394                 :  *  MIME header decode
    2395                 :  */
    2396                 : struct mime_header_decoder_data {
    2397                 :         mbfl_convert_filter *deco_filter;
    2398                 :         mbfl_convert_filter *conv1_filter;
    2399                 :         mbfl_convert_filter *conv2_filter;
    2400                 :         mbfl_memory_device outdev;
    2401                 :         mbfl_memory_device tmpdev;
    2402                 :         int cspos;
    2403                 :         int status;
    2404                 :         enum mbfl_no_encoding encoding;
    2405                 :         enum mbfl_no_encoding incode;
    2406                 :         enum mbfl_no_encoding outcode;
    2407                 : };
    2408                 : 
    2409                 : static int
    2410                 : mime_header_decoder_collector(int c, void* data)
    2411            1085 : {
    2412                 :         const mbfl_encoding *encoding;
    2413            1085 :         struct mime_header_decoder_data *pd = (struct mime_header_decoder_data*)data;
    2414                 : 
    2415            1085 :         switch (pd->status) {
    2416                 :         case 1:
    2417              17 :                 if (c == 0x3f) {                /* ? */
    2418              17 :                         mbfl_memory_device_output(c, &pd->tmpdev);
    2419              17 :                         pd->cspos = pd->tmpdev.pos;
    2420              17 :                         pd->status = 2;
    2421                 :                 } else {
    2422               0 :                         mbfl_convert_filter_devcat(pd->conv1_filter, &pd->tmpdev);
    2423               0 :                         mbfl_memory_device_reset(&pd->tmpdev);
    2424               0 :                         if (c == 0x3d) {                /* = */
    2425               0 :                                 mbfl_memory_device_output(c, &pd->tmpdev);
    2426               0 :                         } else if (c == 0x0d || c == 0x0a) {    /* CR or LF */
    2427               0 :                                 pd->status = 9;
    2428                 :                         } else {
    2429               0 :                                 (*pd->conv1_filter->filter_function)(c, pd->conv1_filter);
    2430               0 :                                 pd->status = 0;
    2431                 :                         }
    2432                 :                 }
    2433              17 :                 break;
    2434                 :         case 2:         /* store charset string */
    2435             157 :                 if (c == 0x3f) {                /* ? */
    2436                 :                         /* identify charset */
    2437              17 :                         mbfl_memory_device_output('\0', &pd->tmpdev);
    2438              17 :                         encoding = mbfl_name2encoding((const char *)&pd->tmpdev.buffer[pd->cspos]);
    2439              17 :                         if (encoding != NULL) {
    2440              17 :                                 pd->incode = encoding->no_encoding;
    2441              17 :                                 pd->status = 3;
    2442                 :                         }
    2443              17 :                         mbfl_memory_device_unput(&pd->tmpdev);
    2444              17 :                         mbfl_memory_device_output(c, &pd->tmpdev);
    2445                 :                 } else {
    2446             140 :                         mbfl_memory_device_output(c, &pd->tmpdev);
    2447             140 :                         if (pd->tmpdev.pos > 100) {               /* too long charset string */
    2448               0 :                                 pd->status = 0;
    2449             140 :                         } else if (c == 0x0d || c == 0x0a) {    /* CR or LF */
    2450               0 :                                 mbfl_memory_device_unput(&pd->tmpdev);
    2451               0 :                                 pd->status = 9;
    2452                 :                         }
    2453             140 :                         if (pd->status != 2) {
    2454               0 :                                 mbfl_convert_filter_devcat(pd->conv1_filter, &pd->tmpdev);
    2455               0 :                                 mbfl_memory_device_reset(&pd->tmpdev);
    2456                 :                         }
    2457                 :                 }
    2458             157 :                 break;
    2459                 :         case 3:         /* identify encoding */
    2460              17 :                 mbfl_memory_device_output(c, &pd->tmpdev);
    2461              23 :                 if (c == 0x42 || c == 0x62) {           /* 'B' or 'b' */
    2462               6 :                         pd->encoding = mbfl_no_encoding_base64;
    2463               6 :                         pd->status = 4;
    2464              22 :                 } else if (c == 0x51 || c == 0x71) {    /* 'Q' or 'q' */
    2465              11 :                         pd->encoding = mbfl_no_encoding_qprint;
    2466              11 :                         pd->status = 4;
    2467                 :                 } else {
    2468               0 :                         if (c == 0x0d || c == 0x0a) {   /* CR or LF */
    2469               0 :                                 mbfl_memory_device_unput(&pd->tmpdev);
    2470               0 :                                 pd->status = 9;
    2471                 :                         } else {
    2472               0 :                                 pd->status = 0;
    2473                 :                         }
    2474               0 :                         mbfl_convert_filter_devcat(pd->conv1_filter, &pd->tmpdev);
    2475               0 :                         mbfl_memory_device_reset(&pd->tmpdev);
    2476                 :                 }
    2477              17 :                 break;
    2478                 :         case 4:         /* reset filter */
    2479              17 :                 mbfl_memory_device_output(c, &pd->tmpdev);
    2480              17 :                 if (c == 0x3f) {                /* ? */
    2481                 :                         /* charset convert filter */
    2482              17 :                         mbfl_convert_filter_reset(pd->conv1_filter, pd->incode, mbfl_no_encoding_wchar);
    2483                 :                         /* decode filter */
    2484              17 :                         mbfl_convert_filter_reset(pd->deco_filter, pd->encoding, mbfl_no_encoding_8bit);
    2485              17 :                         pd->status = 5;
    2486                 :                 } else {
    2487               0 :                         if (c == 0x0d || c == 0x0a) {   /* CR or LF */
    2488               0 :                                 mbfl_memory_device_unput(&pd->tmpdev);
    2489               0 :                                 pd->status = 9;
    2490                 :                         } else {
    2491               0 :                                 pd->status = 0;
    2492                 :                         }
    2493               0 :                         mbfl_convert_filter_devcat(pd->conv1_filter, &pd->tmpdev);
    2494                 :                 }
    2495              17 :                 mbfl_memory_device_reset(&pd->tmpdev);
    2496              17 :                 break;
    2497                 :         case 5:         /* encoded block */
    2498             758 :                 if (c == 0x3f) {                /* ? */
    2499              17 :                         pd->status = 6;
    2500                 :                 } else {
    2501             741 :                         (*pd->deco_filter->filter_function)(c, pd->deco_filter);
    2502                 :                 }
    2503             758 :                 break;
    2504                 :         case 6:         /* check end position */
    2505              17 :                 if (c == 0x3d) {                /* = */
    2506                 :                         /* flush and reset filter */
    2507              17 :                         (*pd->deco_filter->filter_flush)(pd->deco_filter);
    2508              17 :                         (*pd->conv1_filter->filter_flush)(pd->conv1_filter);
    2509              17 :                         mbfl_convert_filter_reset(pd->conv1_filter, mbfl_no_encoding_ascii, mbfl_no_encoding_wchar);
    2510              17 :                         pd->status = 7;
    2511                 :                 } else {
    2512               0 :                         (*pd->deco_filter->filter_function)(0x3f, pd->deco_filter);
    2513               0 :                         if (c != 0x3f) {                /* ? */
    2514               0 :                                 (*pd->deco_filter->filter_function)(c, pd->deco_filter);
    2515               0 :                                 pd->status = 5;
    2516                 :                         }
    2517                 :                 }
    2518              17 :                 break;
    2519                 :         case 7:         /* after encoded block */
    2520              12 :                 if (c == 0x0d || c == 0x0a) {   /* CR or LF */
    2521               6 :                         pd->status = 8;
    2522                 :                 } else {
    2523               0 :                         mbfl_memory_device_output(c, &pd->tmpdev);
    2524               0 :                         if (c == 0x3d) {                /* = */
    2525               0 :                                 pd->status = 1;
    2526               0 :                         } else if (c != 0x20 && c != 0x09) {            /* neither space nor tab */
    2527               0 :                                 mbfl_convert_filter_devcat(pd->conv1_filter, &pd->tmpdev);
    2528               0 :                                 mbfl_memory_device_reset(&pd->tmpdev);
    2529               0 :                                 pd->status = 0;
    2530                 :                         }
    2531                 :                 }
    2532               6 :                 break;
    2533                 :         case 8:         /* folding after an encoded block */
    2534                 :         case 9:         /* folding after a non-encoded block */
    2535               6 :                 if (c != 0x0d && c != 0x0a && c != 0x20 && c != 0x09) {
    2536               6 :                         if (c == 0x3d) {                /* = */
    2537               6 :                                 if (pd->status == 8) {
    2538               6 :                                         mbfl_memory_device_output(0x20, &pd->tmpdev);    /* SPACE */
    2539                 :                                 } else {
    2540               0 :                                         (*pd->conv1_filter->filter_function)(0x20, pd->conv1_filter);
    2541                 :                                 }
    2542               6 :                                 mbfl_memory_device_output(c, &pd->tmpdev);
    2543               6 :                                 pd->status = 1;
    2544                 :                         } else {
    2545               0 :                                 mbfl_memory_device_output(0x20, &pd->tmpdev);
    2546               0 :                                 mbfl_memory_device_output(c, &pd->tmpdev);
    2547               0 :                                 mbfl_convert_filter_devcat(pd->conv1_filter, &pd->tmpdev);
    2548               0 :                                 mbfl_memory_device_reset(&pd->tmpdev);
    2549               0 :                                 pd->status = 0;
    2550                 :                         }
    2551                 :                 }
    2552               6 :                 break;
    2553                 :         default:                /* non encoded block */
    2554              90 :                 if (c == 0x0d || c == 0x0a) {   /* CR or LF */
    2555               0 :                         pd->status = 9;
    2556              90 :                 } else if (c == 0x3d) {         /* = */
    2557              11 :                         mbfl_memory_device_output(c, &pd->tmpdev);
    2558              11 :                         pd->status = 1;
    2559                 :                 } else {
    2560              79 :                         (*pd->conv1_filter->filter_function)(c, pd->conv1_filter);
    2561                 :                 }
    2562                 :                 break;
    2563                 :         }
    2564                 : 
    2565            1085 :         return c;
    2566                 : }
    2567                 : 
    2568                 : mbfl_string *
    2569                 : mime_header_decoder_result(struct mime_header_decoder_data *pd, mbfl_string *result)
    2570              31 : {
    2571              31 :         switch (pd->status) {
    2572                 :         case 1:
    2573                 :         case 2:
    2574                 :         case 3:
    2575                 :         case 4:
    2576                 :         case 7:
    2577                 :         case 8:
    2578                 :         case 9:
    2579              11 :                 mbfl_convert_filter_devcat(pd->conv1_filter, &pd->tmpdev);
    2580              11 :                 break;
    2581                 :         case 5:
    2582                 :         case 6:
    2583               0 :                 (*pd->deco_filter->filter_flush)(pd->deco_filter);
    2584               0 :                 (*pd->conv1_filter->filter_flush)(pd->conv1_filter);
    2585                 :                 break;
    2586                 :         }
    2587              31 :         (*pd->conv2_filter->filter_flush)(pd->conv2_filter);
    2588              31 :         mbfl_memory_device_reset(&pd->tmpdev);
    2589              31 :         pd->status = 0;
    2590                 : 
    2591              31 :         return mbfl_memory_device_result(&pd->outdev, result);
    2592                 : }
    2593                 : 
    2594                 : struct mime_header_decoder_data*
    2595                 : mime_header_decoder_new(enum mbfl_no_encoding outcode)
    2596              31 : {
    2597                 :         struct mime_header_decoder_data *pd;
    2598                 : 
    2599              31 :         pd = (struct mime_header_decoder_data*)mbfl_malloc(sizeof(struct mime_header_decoder_data));
    2600              31 :         if (pd == NULL) {
    2601               0 :                 return NULL;
    2602                 :         }
    2603                 : 
    2604              31 :         mbfl_memory_device_init(&pd->outdev, 0, 0);
    2605              31 :         mbfl_memory_device_init(&pd->tmpdev, 0, 0);
    2606              31 :         pd->cspos = 0;
    2607              31 :         pd->status = 0;
    2608              31 :         pd->encoding = mbfl_no_encoding_pass;
    2609              31 :         pd->incode = mbfl_no_encoding_ascii;
    2610              31 :         pd->outcode = outcode;
    2611                 :         /* charset convert filter */
    2612              31 :         pd->conv2_filter = mbfl_convert_filter_new(mbfl_no_encoding_wchar, pd->outcode, mbfl_memory_device_output, 0, &pd->outdev);
    2613              31 :         pd->conv1_filter = mbfl_convert_filter_new(pd->incode, mbfl_no_encoding_wchar, mbfl_filter_output_pipe, 0, pd->conv2_filter);
    2614                 :         /* decode filter */
    2615              31 :         pd->deco_filter = mbfl_convert_filter_new(pd->encoding, mbfl_no_encoding_8bit, mbfl_filter_output_pipe, 0, pd->conv1_filter);
    2616                 : 
    2617              31 :         if (pd->conv1_filter == NULL || pd->conv2_filter == NULL || pd->deco_filter == NULL) {
    2618               0 :                 mime_header_decoder_delete(pd);
    2619               0 :                 return NULL;
    2620                 :         }
    2621                 : 
    2622              31 :         return pd;
    2623                 : }
    2624                 : 
    2625                 : void
    2626                 : mime_header_decoder_delete(struct mime_header_decoder_data *pd)
    2627              31 : {
    2628              31 :         if (pd) {
    2629              31 :                 mbfl_convert_filter_delete(pd->conv2_filter);
    2630              31 :                 mbfl_convert_filter_delete(pd->conv1_filter);
    2631              31 :                 mbfl_convert_filter_delete(pd->deco_filter);
    2632              31 :                 mbfl_memory_device_clear(&pd->outdev);
    2633              31 :                 mbfl_memory_device_clear(&pd->tmpdev);
    2634              31 :                 mbfl_free((void*)pd);
    2635                 :         }
    2636              31 : }
    2637                 : 
    2638                 : int
    2639                 : mime_header_decoder_feed(int c, struct mime_header_decoder_data *pd)
    2640               0 : {
    2641               0 :         return mime_header_decoder_collector(c, pd);
    2642                 : }
    2643                 : 
    2644                 : mbfl_string *
    2645                 : mbfl_mime_header_decode(
    2646                 :     mbfl_string *string,
    2647                 :     mbfl_string *result,
    2648                 :     enum mbfl_no_encoding outcode)
    2649              31 : {
    2650                 :         int n;
    2651                 :         unsigned char *p;
    2652                 :         struct mime_header_decoder_data *pd;
    2653                 : 
    2654              31 :         mbfl_string_init(result);
    2655              31 :         result->no_language = string->no_language;
    2656              31 :         result->no_encoding = outcode;
    2657                 : 
    2658              31 :         pd = mime_header_decoder_new(outcode);
    2659              31 :         if (pd == NULL) {
    2660               0 :                 return NULL;
    2661                 :         }
    2662                 : 
    2663                 :         /* feed data */
    2664              31 :         n = string->len;
    2665              31 :         p = string->val;
    2666            1147 :         while (n > 0) {
    2667            1085 :                 mime_header_decoder_collector(*p++, pd);
    2668            1085 :                 n--;
    2669                 :         }
    2670                 : 
    2671              31 :         result = mime_header_decoder_result(pd, result);
    2672              31 :         mime_header_decoder_delete(pd);
    2673                 : 
    2674              31 :         return result;
    2675                 : }
    2676                 : 
    2677                 : 
    2678                 : 
    2679                 : /*
    2680                 :  *  convert HTML numeric entity
    2681                 :  */
    2682                 : struct collector_htmlnumericentity_data {
    2683                 :         mbfl_convert_filter *decoder;
    2684                 :         int status;
    2685                 :         int cache;
    2686                 :         int digit;
    2687                 :         int *convmap;
    2688                 :         int mapsize;
    2689                 : };
    2690                 : 
    2691                 : static int
    2692                 : collector_encode_htmlnumericentity(int c, void *data)
    2693               0 : {
    2694               0 :         struct collector_htmlnumericentity_data *pc = (struct collector_htmlnumericentity_data *)data;
    2695                 :         int f, n, s, r, d, size, *mapelm;
    2696                 : 
    2697               0 :         size = pc->mapsize;
    2698               0 :         f = 0;
    2699               0 :         n = 0;
    2700               0 :         while (n < size) {
    2701               0 :                 mapelm = &(pc->convmap[n*4]);
    2702               0 :                 if (c >= mapelm[0] && c <= mapelm[1]) {
    2703               0 :                         s = (c + mapelm[2]) & mapelm[3];
    2704               0 :                         if (s >= 0) {
    2705               0 :                                 (*pc->decoder->filter_function)(0x26, pc->decoder);    /* '&' */
    2706               0 :                                 (*pc->decoder->filter_function)(0x23, pc->decoder);    /* '#' */
    2707               0 :                                 r = 100000000;
    2708               0 :                                 s %= r;
    2709               0 :                                 while (r > 0) {
    2710               0 :                                         d = s/r;
    2711               0 :                                         if (d || f) {
    2712               0 :                                                 f = 1;
    2713               0 :                                                 s %= r;
    2714               0 :                                                 (*pc->decoder->filter_function)(mbfl_hexchar_table[d], pc->decoder);
    2715                 :                                         }
    2716               0 :                                         r /= 10;
    2717                 :                                 }
    2718               0 :                                 if (!f) {
    2719               0 :                                         f = 1;
    2720               0 :                                         (*pc->decoder->filter_function)(mbfl_hexchar_table[0], pc->decoder);
    2721                 :                                 }
    2722               0 :                                 (*pc->decoder->filter_function)(0x3b, pc->decoder);            /* ';' */
    2723                 :                         }
    2724                 :                 }
    2725               0 :                 if (f) {
    2726               0 :                         break;
    2727                 :                 }
    2728               0 :                 n++;
    2729                 :         }
    2730               0 :         if (!f) {
    2731               0 :                 (*pc->decoder->filter_function)(c, pc->decoder);
    2732                 :         }
    2733                 : 
    2734               0 :         return c;
    2735                 : }
    2736                 : 
    2737                 : static int
    2738                 : collector_decode_htmlnumericentity(int c, void *data)
    2739               0 : {
    2740               0 :         struct collector_htmlnumericentity_data *pc = (struct collector_htmlnumericentity_data *)data;
    2741                 :         int f, n, s, r, d, size, *mapelm;
    2742                 : 
    2743               0 :         switch (pc->status) {
    2744                 :         case 1:
    2745               0 :                 if (c == 0x23) {        /* '#' */
    2746               0 :                         pc->status = 2;
    2747                 :                 } else {
    2748               0 :                         pc->status = 0;
    2749               0 :                         (*pc->decoder->filter_function)(0x26, pc->decoder);            /* '&' */
    2750               0 :                         (*pc->decoder->filter_function)(c, pc->decoder);
    2751                 :                 }
    2752               0 :                 break;
    2753                 :         case 2:
    2754               0 :                 if (c >= 0x30 && c <= 0x39) {     /* '0' - '9' */
    2755               0 :                         pc->cache = c - 0x30;
    2756               0 :                         pc->status = 3;
    2757               0 :                         pc->digit = 1;
    2758                 :                 } else {
    2759               0 :                         pc->status = 0;
    2760               0 :                         (*pc->decoder->filter_function)(0x26, pc->decoder);            /* '&' */
    2761               0 :                         (*pc->decoder->filter_function)(0x23, pc->decoder);            /* '#' */
    2762               0 :                         (*pc->decoder->filter_function)(c, pc->decoder);
    2763                 :                 }
    2764               0 :                 break;
    2765                 :         case 3:
    2766               0 :                 s = 0;
    2767               0 :                 f = 0;
    2768               0 :                 if (c >= 0x30 && c <= 0x39) {     /* '0' - '9' */
    2769               0 :                         if (pc->digit > 9) {
    2770               0 :                                 pc->status = 0;
    2771               0 :                                 s = pc->cache;
    2772               0 :                                 f = 1;
    2773                 :                         } else {
    2774               0 :                                 s = pc->cache*10 + c - 0x30;
    2775               0 :                                 pc->cache = s;
    2776               0 :                                 pc->digit++;
    2777                 :                         }
    2778                 :                 } else {
    2779               0 :                         pc->status = 0;
    2780               0 :                         s = pc->cache;
    2781               0 :                         f = 1;
    2782               0 :                         n = 0;
    2783               0 :                         size = pc->mapsize;
    2784               0 :                         while (n < size) {
    2785               0 :                                 mapelm = &(pc->convmap[n*4]);
    2786               0 :                                 d = s - mapelm[2];
    2787               0 :                                 if (d >= mapelm[0] && d <= mapelm[1]) {
    2788               0 :                                         f = 0;
    2789               0 :                                         (*pc->decoder->filter_function)(d, pc->decoder);
    2790               0 :                                         if (c != 0x3b) {        /* ';' */
    2791               0 :                                                 (*pc->decoder->filter_function)(c, pc->decoder);
    2792                 :                                         }
    2793               0 :                                         break;
    2794                 :                                 }
    2795               0 :                                 n++;
    2796                 :                         }
    2797                 :                 }
    2798               0 :                 if (f) {
    2799               0 :                         (*pc->decoder->filter_function)(0x26, pc->decoder);            /* '&' */
    2800               0 :                         (*pc->decoder->filter_function)(0x23, pc->decoder);            /* '#' */
    2801               0 :                         r = 1;
    2802               0 :                         n = pc->digit;
    2803               0 :                         while (n > 0) {
    2804               0 :                                 r *= 10;
    2805               0 :                                 n--;
    2806                 :                         }
    2807               0 :                         s %= r;
    2808               0 :                         r /= 10;
    2809               0 :                         while (r > 0) {
    2810               0 :                                 d = s/r;
    2811               0 :                                 s %= r;
    2812               0 :                                 r /= 10;
    2813               0 :                                 (*pc->decoder->filter_function)(mbfl_hexchar_table[d], pc->decoder);
    2814                 :                         }
    2815               0 :                         (*pc->decoder->filter_function)(c, pc->decoder);
    2816                 :                 }
    2817               0 :                 break;
    2818                 :         default:
    2819               0 :                 if (c == 0x26) {        /* '&' */
    2820               0 :                         pc->status = 1;
    2821                 :                 } else {
    2822               0 :                         (*pc->decoder->filter_function)(c, pc->decoder);
    2823                 :                 }
    2824                 :                 break;
    2825                 :         }
    2826                 : 
    2827               0 :         return c;
    2828                 : }
    2829                 : 
    2830                 : mbfl_string *
    2831                 : mbfl_html_numeric_entity(
    2832                 :     mbfl_string *string,
    2833                 :     mbfl_string *result,
    2834                 :     int *convmap,
    2835                 :     int mapsize,
    2836                 :     int type)
    2837               0 : {
    2838                 :         struct collector_htmlnumericentity_data pc;
    2839                 :         mbfl_memory_device device;
    2840                 :         mbfl_convert_filter *encoder;
    2841                 :         int n;
    2842                 :         unsigned char *p;
    2843                 : 
    2844               0 :         if (string == NULL || result == NULL) {
    2845               0 :                 return NULL;
    2846                 :         }
    2847               0 :         mbfl_string_init(result);
    2848               0 :         result->no_language = string->no_language;
    2849               0 :         result->no_encoding = string->no_encoding;
    2850               0 :         mbfl_memory_device_init(&device, string->len, 0);
    2851                 : 
    2852                 :         /* output code filter */
    2853               0 :         pc.decoder = mbfl_convert_filter_new(
    2854                 :             mbfl_no_encoding_wchar,
    2855                 :             string->no_encoding,
    2856                 :             mbfl_memory_device_output, 0, &device);
    2857                 :         /* wchar filter */
    2858               0 :         if (type == 0) {
    2859               0 :                 encoder = mbfl_convert_filter_new(
    2860                 :                     string->no_encoding,
    2861                 :                     mbfl_no_encoding_wchar,
    2862                 :                     collector_encode_htmlnumericentity, 0, &pc);
    2863                 :         } else {
    2864               0 :                 encoder = mbfl_convert_filter_new(
    2865                 :                     string->no_encoding,
    2866                 :                     mbfl_no_encoding_wchar,
    2867                 :                     collector_decode_htmlnumericentity, 0, &pc);
    2868                 :         }
    2869               0 :         if (pc.decoder == NULL || encoder == NULL) {
    2870               0 :                 mbfl_convert_filter_delete(encoder);
    2871               0 :                 mbfl_convert_filter_delete(pc.decoder);
    2872               0 :                 return NULL;
    2873                 :         }
    2874               0 :         pc.status = 0;
    2875               0 :         pc.cache = 0;
    2876               0 :         pc.digit = 0;
    2877               0 :         pc.convmap = convmap;
    2878               0 :         pc.mapsize = mapsize;
    2879                 : 
    2880                 :         /* feed data */
    2881               0 :         p = string->val;
    2882               0 :         n = string->len;
    2883               0 :         if (p != NULL) {
    2884               0 :                 while (n > 0) {
    2885               0 :                         if ((*encoder->filter_function)(*p++, encoder) < 0) {
    2886               0 :                                 break;
    2887                 :                         }
    2888               0 :                         n--;
    2889                 :                 }
    2890                 :         }
    2891               0 :         mbfl_convert_filter_flush(encoder);
    2892               0 :         mbfl_convert_filter_flush(pc.decoder);
    2893               0 :         result = mbfl_memory_device_result(&device, result);
    2894               0 :         mbfl_convert_filter_delete(encoder);
    2895               0 :         mbfl_convert_filter_delete(pc.decoder);
    2896                 : 
    2897               0 :         return result;
    2898                 : }
    2899                 : 
    2900                 : /*
    2901                 :  * Local variables:
    2902                 :  * tab-width: 4
    2903                 :  * c-basic-offset: 4
    2904                 :  * End:
    2905                 :  */
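
The convmap lookup in collector_encode_htmlnumericentity (source lines 2692-2735, all reported with zero executions) can be illustrated outside libmbfl. What follows is a minimal sketch, not the library's code: the name sketch_encode_entity, the buffer-based output, and the ASCII-only pass-through are assumptions made for illustration. Only the four-int map layout (start, end, offset, mask), the (c + offset) & mask step, and the cap at eight decimal digits are taken from the listing.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical standalone helper (not part of libmbfl).
 * Encodes code point c as a decimal numeric entity if it falls inside any
 * range of convmap; each map entry is four ints: start, end, offset, mask.
 * Returns the number of characters written to out (excluding the NUL),
 * or -1 if out is too small.
 */
static int
sketch_encode_entity(int c, const int *convmap, int mapsize,
                     char *out, size_t outlen)
{
	int n, s, len;

	assert(convmap != NULL && out != NULL);

	for (n = 0; n < mapsize; n++) {
		const int *mapelm = &convmap[n * 4];

		if (c >= mapelm[0] && c <= mapelm[1]) {
			s = (c + mapelm[2]) & mapelm[3];
			if (s >= 0) {
				/* The listing keeps at most 8 decimal digits (s %= 100000000). */
				len = snprintf(out, outlen, "&#%d;", s % 100000000);
				return (len < 0 || (size_t)len >= outlen) ? -1 : len;
			}
		}
	}

	/* No range matched: copy the character through.
	 * Assumption: c is ASCII here; the real filter hands c to a converter. */
	if (outlen < 2) {
		return -1;
	}
	out[0] = (char)c;
	out[1] = '\0';
	return 1;
}
```

With the single map entry { 0x80, 0x10ffff, 0, 0xffffff }, code point 0xE9 yields the string &#233; while 'A' lies outside the range and is copied unchanged. A test that feeds such a map through the real entry point would exercise the lines currently marked as not executed.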

Generated by: LTP GCOV extension version 1.5

Generated at Thu, 19 Nov 2009 08:20:11 +0000

Copyright © 2005-2009 The PHP Group
All rights reserved.