Jcode.pm - a successor to jcode.pl, a bridge to Encode.pm


Last Modified:Sunday, 11-May-2008 03:16:47 JST

by Dan Kogai


As of Perl 5.8.0, all the Jcode capabilities are available in the standard distribution via the Encode module. Though I will maintain Jcode for old perls, I recommend that you use Encode if your perl is new enough. Encode is more featureful, more robust, and most of all, standard.

As of Jcode 2.0, Jcode acts as a wrapper around Encode for perl 5.8.1 and later.

In those cases, Jcode.pm is self-contained, so you don't even have to run make install. All you have to do is copy Jcode.pm into the library path, a la jcode.pl.

For older perls, Jcode works the same as version 0.88.
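For readers on a newer perl, here is a minimal sketch of the Encode route recommended above. The sample string and the choice of EUC-JP and Shift_JIS as targets are illustrative assumptions, not something this page prescribes.

```perl
use strict;
use warnings;
use Encode qw(decode encode from_to);

# Illustrative input: "日本語" as UTF-8 octets.
my $utf8_octets = "\xE6\x97\xA5\xE6\x9C\xAC\xE8\xAA\x9E";

# Decode the octets into Perl characters, then encode them as EUC-JP.
my $chars = decode('UTF-8', $utf8_octets);
my $euc   = encode('euc-jp', $chars);

# from_to() converts a byte string in place from one encoding to another.
my $sjis = $euc;
from_to($sjis, 'euc-jp', 'shiftjis');

printf "chars=%d euc=%d sjis=%d bytes\n",
    length($chars), length($euc), length($sjis);
```

Both Encode and its Japanese encodings ship with perl 5.8 and later, so nothing needs to be installed for this to run.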


Jcode.pm

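Is a Perl module that handles various Japanese charsets. It has all the features of jcode.pl-2.10 plus more, such as the object-oriented interface, MIME header support, and Unicode support described below.

The documentation is in Jcode.html, which is generated from Jcode.pm with pod2html.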

History

See Changes for changes. Old Changes are available as Changes.ver0X.

Questions and bug reports

Now that Jcode.pm is an official OpenLab Project, it has an official mailing list for questions and bug reports.


Install

is easy.

via CPAN

This is the recommended way because it saves network bandwidth. Also note that the CPAN module checks version numbers, so you can always get the latest version that way.

  1. Make sure you have perl5. Jcode.pm is for perl5 only.
  2. Run the CPAN module from your shell:

    $ perl -MCPAN -e shell

  3. If this is your first time running the CPAN module, it will ask a series of questions, mostly about your network. Just answer them and you are ready to go. As for the choice of mirror sites, if you are in Japan, try

    http://www.ring.gr.jp/pub/lang/perl/CPAN/

  4. Now all you have to do is "install Jcode" as follows:

    cpan> install Jcode

    The CPAN module takes care of the rest: it downloads the necessary tarball, unpacks it, and runs make, make test, and make install.

via Tarball

If you are too impatient to wait for CPAN to update, here's how.

  1. Make sure you have perl5. Jcode.pm is for perl5 only.
  2. Download Jcode-2.07.tar.gz or Jcode-2.07.zip.
  3. $ gunzip -c Jcode-*.tar.gz | tar xf -
    # or "tar zxf Jcode-x.xx.tar.gz" if your tar is GNU tar
  4. $ cd Jcode-*/
  5. $ perl Makefile.PL; make; make install

Attention: Pre-0.50 users

If you installed a version of Jcode prior to 0.50 manually (that is, without the "perl Makefile.PL" procedure, so this applies to Mac and Windows users only), please delete the old Jcode.pm and the Jcode directory from @INC by hand before installing the new version.

When Unicode conversion is needed, newer versions of Jcode first try to load Jcode::Unicode and, if that fails (as on Mac, Windows, or any other environment where XS is not supported), load Jcode::Unicode::NoXS instead. Jcode therefore gets confused if an older Jcode/Unicode.pm file is still around. Sorry for the inconvenience.
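The try-XS-first, fall-back-to-pure-Perl order described above can be sketched roughly as follows. This is not Jcode's actual source; `load_first_available` is a hypothetical helper written only to illustrate the pattern.

```perl
use strict;
use warnings;

# Hypothetical helper: return the name of the first module that loads.
sub load_first_available {
    my @candidates = @_;
    for my $module (@candidates) {
        # String eval, because the module name is held in a variable.
        return $module if eval "require $module; 1";
    }
    return;    # none of the candidates could be loaded
}

# Same order as described above: XS version first, then the NoXS fallback.
my $backend = load_first_available('Jcode::Unicode', 'Jcode::Unicode::NoXS');
print defined $backend ? "using $backend\n" : "no Unicode backend found\n";
```

A stale Jcode/Unicode.pm left in @INC would satisfy the first require, which is why the old files have to be removed by hand.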



Examples

Here are some.

Migrating from jcode.pl

  1. Replace all occurrences of "require 'jcode.pl';" with "use Jcode;"
  2. Replace all occurrences of "jcode::" with "Jcode::"

This much should suffice for most cases.
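As a rough illustration, the two replacements can be applied mechanically to a script's source text. `migrate_source` is a hypothetical helper, and the regular expressions assume the require line is written in the common quoted form; review the result by hand rather than trusting it blindly.

```perl
use strict;
use warnings;

# Hypothetical helper: apply the two migration edits to source text.
sub migrate_source {
    my ($src) = @_;
    # Step 1: require 'jcode.pl';  ->  use Jcode;
    $src =~ s/require\s+(['"])jcode\.pl\1\s*;/use Jcode;/g;
    # Step 2: jcode::  ->  Jcode::
    $src =~ s/\bjcode::/Jcode::/g;
    return $src;
}

my $old = <<'END';
require 'jcode.pl';
jcode::convert(\$str, 'euc');
END

print migrate_source($old);
```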

Convert files at once.

perl -MJcode -i.bak -lpe 'Jcode::convert(\$_, "charcode")' files...

or

perl -MJcode -i.bak -lne 'print jcode(\$_)->charcode' files...

The original files are kept with a ".bak" suffix.

Check your mailbox

perl -MJcode -00lne 'print jcode($_)->mime_decode->charcode' $mail


Why reinvent the wheel?

With Encode available, this section is somewhat obsolete, but I'll leave it anyhow.

Virtually all Japanese perl coders must have used jcode.pl, a perl script that converts Japanese text from one character set to another. While jcode.pl has all the necessary functionality, it has the following problems:

Perl5-compliance

While it runs OK on perl5, you have to use typeglobs or references to give jcode::convert() the right arguments. It doesn't look great to say

jcode::convert(\$str, 'jis', jcode::getcode(\$str), "z");
print $str;

Wouldn't it be nice if you could just write

print jcode($str)->h2z->jis;

to do the same?

MIME header support

RFC1522 states that converting a string to iso-2022-jp is not enough to put it into a MIME header. You have to further encode the string with base64, then sandwich that between =?ISO-2022-JP?B? and ?=. Wouldn't it be nice if you could just say

$header = jcode($str)->mime_encode;
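To make the manual steps concrete, here is a sketch of them written with the core Encode and MIME::Base64 modules rather than Jcode. `mime_b_encode` is a hypothetical helper, and it assumes the text is short enough to fit in a single encoded-word; longer headers would need to be split, which this sketch does not attempt.

```perl
use strict;
use warnings;
use Encode qw(decode encode);
use MIME::Base64 qw(encode_base64);

# Hypothetical helper: wrap a short character string as one encoded-word.
sub mime_b_encode {
    my ($chars) = @_;
    my $jis = encode('iso-2022-jp', $chars);    # step 1: to ISO-2022-JP
    my $b64 = encode_base64($jis, '');          # step 2: base64, no newline
    return '=?ISO-2022-JP?B?' . $b64 . '?=';    # step 3: add the delimiters
}

# Illustrative input: "日本語", decoded from UTF-8 octets.
my $subject = decode('UTF-8', "\xE6\x97\xA5\xE6\x9C\xAC\xE8\xAA\x9E");
print "Subject: ", mime_b_encode($subject), "\n";
```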

Unicode support

I am not a big fan of Unicode but we have to admit the future is there...

