
Re: [perl #60472] Module Encode degrades in Perl 5.10

From: demerphq
Date: November 11, 2008 09:12
Subject: Re: [perl #60472] Module Encode degrades in Perl 5.10
Message ID: 9b18b3110811110912v7932862dq91bbdc4accacd701@mail.gmail.com

2008/11/11 demerphq <demerphq@gmail.com>:
> 2008/11/11 mihara@twister.dev.iwa.fujixerox.co.jp (via RT)
> <perlbug-followup@perl.org>:
>> # New Ticket Created by  mihara@twister.dev.iwa.fujixerox.co.jp
>> # Please include the string:  [perl #60472]
>> # in the subject line of all future correspondence about this issue.
>> # <URL: http://rt.perl.org/rt3/Ticket/Display.html?id=60472 >
>>
>>
>>
>> This is a bug report for perl from mihara@twister.dev.iwa.fujixerox.co.jp,
>> generated with the help of perlbug 1.36 running under perl 5.10.0.
>>
>>
>> -----------------------------------------------------------------
>> [Please enter your report here]
>> I found this bug while playing with the Encode::IMAPUTF7 module.
>> Encode::IMAPUTF7 started having a problem when I upgraded Perl from 5.8 to 5.10.
>> The problem is that if "$1" is fed into "encode()" directly, $1 is not
>> updated by subsequent pattern matches.  Here is a small script that reproduces it.
>>
>>
>> ----
>> #!/usr/bin/perl
>> use MIME::Base64;
>> use Encode;
>> my $e_utf16 = find_encoding("UTF-16BE");
>>
>> $str="abcあdef";
>> my $len=length($str);
>> $re1=qr/(?:[a-z])/;
>> $re2=qr/(?:[^a-z])/;
>> $byte='';
>> pos($str) = 0;
>> while (pos($str) < $len) {
>>        #print pos($str),"\n";
>>        print "\n",pos($str),":",$str,"[$1]\n";
>>        if ($str =~ /\G($re1+)/cg) {
>>                print "($1)";
>>                $bytes .= $1;
>>        } elsif ($str =~ /\G($re2+)/cg) {
>>                my $base64 = encode_base64($e_utf16->encode($1), '');
>>                print "<$1:$base64>";
>>                $bytes .= $1;
>>        } else {
>>                die "aaa";
>>        }
>> }

If I use Devel::Peek to look at $1 after each regex match, I get the
output at the end of this message (the first run is blead, the second
is 5.8.6).
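
For anyone who wants to produce similar dumps, here is a minimal sketch
of how the loop could be instrumented. It is not the exact script used
for the output in this message. The string is an ASCII stand-in for the
one in the report, and @seen is a hypothetical helper that only records
what was captured. Dump() comes from Devel::Peek and writes to STDERR.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Devel::Peek;    # core module; Dump() is exported by default

# ASCII stand-in for the string in the report (assumption).
my $str = "abc?def";
my @seen;           # hypothetical helper: records each capture

pos($str) = 0;
while (pos($str) < length $str) {
    if ($str =~ /\G([a-z]+)/cg) {
        Dump($1);           # FLAGS, PV and MAGIC of $1 go to STDERR
        push @seen, "$1";   # stringify to store a plain copy
    }
    elsif ($str =~ /\G([^a-z]+)/cg) {
        Dump($1);
        push @seen, "$1";
    }
    else {
        die "no match at position " . pos($str);
    }
}
print join(",", @seen), "\n";
```

The FLAGS line of each dump is the part to compare between perl
versions.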

So for some reason $1 isn't marked as READONLY in 5.10, and what I find
REALLY interesting is the last Dump output from each run.  In particular,
we see this from 5.8.6:

SV = PVMG(0x1a76b2c) at 0x1a98d9c
  REFCNT = 1
  FLAGS = (GMG,SMG,READONLY,pPOK)
  IV = 0
  NV = 0
  PV = 0x1a9359c "?"\0
  CUR = 1
  LEN = 11
  MAGIC = 0x1a994c4
    MG_VIRTUAL = &PL_vtbl_sv
    MG_TYPE = PERL_MAGIC_sv(\0)
    MG_OBJ = 0x1a98d90
    MG_LEN = 1
    MG_PTR = 0x1a803b4 "1"
(def)

and in blead:

SV = PVMG(0x1ac052c) at 0x1a4dab4
  REFCNT = 1
  FLAGS = (GMG,SMG,POK,pPOK,UTF8)
  IV = 0
  NV = 0
  PV = 0x1a60834 "?"\0 [UTF8 "?"]
  CUR = 1
  LEN = 12
  MAGIC = 0x1aa6bb4
    MG_VIRTUAL = &PL_vtbl_sv
    MG_TYPE = PERL_MAGIC_sv(\0)
    MG_OBJ = 0x1a4dac4
    MG_LEN = 1
    MG_PTR = 0x1ab4a8c "1"
(?)

So, apart from the added UTF8 flag (where did that come from?), the
added public POK flag, and the missing READONLY flag, the dumps are
basically the same, which really makes me scratch my head.
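
The report says the problem shows up when $1 is passed to encode()
directly. A minimal sketch of the obvious thing to try, assuming that
copying the capture into a plain lexical first avoids it (this has not
been verified on 5.10.0). The string is an ASCII stand-in for the one in
the report; $chunk and @out are hypothetical names.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Encode qw(find_encoding);
use MIME::Base64 qw(encode_base64);

my $e_utf16 = find_encoding("UTF-16BE");

# ASCII stand-in for the string in the report (assumption).
my $str = "abc?def";
my @out;    # hypothetical helper: collects the formatted pieces

pos($str) = 0;
while (pos($str) < length $str) {
    if ($str =~ /\G([a-z]+)/cg) {
        push @out, "($1)";
    }
    elsif ($str =~ /\G([^a-z]+)/cg) {
        # Copy the capture so encode() gets an ordinary scalar,
        # not the magical $1 itself.
        my $chunk  = $1;
        my $base64 = encode_base64($e_utf16->encode($chunk), '');
        push @out, "<$chunk:$base64>";
    }
    else {
        die "no match at position " . pos($str);
    }
}
print join("", @out), "\n";
```

On a perl without the problem this prints (abc)<?:AD8=>(def), which
matches the 5.8.6 run in this message.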

D:\dev\perl\ver\p4\win32>..\perl ..\encode_bug.pl

0:abc?def[]
SV = PVMG(0x1ac052c) at 0x1a4dab4
  REFCNT = 1
  FLAGS = (GMG,SMG)
  IV = 0
  NV = 0
  PV = 0x1a60834 "UCS"\0
  CUR = 3
  LEN = 12
  MAGIC = 0x1aa6bb4
    MG_VIRTUAL = &PL_vtbl_sv
    MG_TYPE = PERL_MAGIC_sv(\0)
    MG_OBJ = 0x1a4dac4
    MG_LEN = 1
    MG_PTR = 0x1ab4a8c "1"
(abc)
3:abc?def[abc]
SV = PVMG(0x1ac052c) at 0x1a4dab4
  REFCNT = 1
  FLAGS = (GMG,SMG,pPOK)
  IV = 0
  NV = 0
  PV = 0x1a60834 "abc"\0
  CUR = 3
  LEN = 12
  MAGIC = 0x1aa6bb4
    MG_VIRTUAL = &PL_vtbl_sv
    MG_TYPE = PERL_MAGIC_sv(\0)
    MG_OBJ = 0x1a4dac4
    MG_LEN = 1
    MG_PTR = 0x1ab4a8c "1"
<?:AD8=>
4:abc?def[?]
SV = PVMG(0x1ac052c) at 0x1a4dab4
  REFCNT = 1
  FLAGS = (GMG,SMG,POK,pPOK,UTF8)
  IV = 0
  NV = 0
  PV = 0x1a60834 "?"\0 [UTF8 "?"]
  CUR = 1
  LEN = 12
  MAGIC = 0x1aa6bb4
    MG_VIRTUAL = &PL_vtbl_sv
    MG_TYPE = PERL_MAGIC_sv(\0)
    MG_OBJ = 0x1a4dac4
    MG_LEN = 1
    MG_PTR = 0x1ab4a8c "1"
(?)
D:\dev\perl\ver\p4\win32>perl ..\encode_bug.pl

0:abc?def[]
SV = PVMG(0x1a76b2c) at 0x1a98d9c
  REFCNT = 1
  FLAGS = (GMG,SMG,READONLY)
  IV = 0
  NV = 0
  PV = 0x1a9359c "UCS"\0
  CUR = 3
  LEN = 11
  MAGIC = 0x1a994c4
    MG_VIRTUAL = &PL_vtbl_sv
    MG_TYPE = PERL_MAGIC_sv(\0)
    MG_OBJ = 0x1a98d90
    MG_LEN = 1
    MG_PTR = 0x1a803b4 "1"
(abc)
3:abc?def[abc]
SV = PVMG(0x1a76b2c) at 0x1a98d9c
  REFCNT = 1
  FLAGS = (GMG,SMG,READONLY,pPOK)
  IV = 0
  NV = 0
  PV = 0x1a9359c "abc"\0
  CUR = 3
  LEN = 11
  MAGIC = 0x1a994c4
    MG_VIRTUAL = &PL_vtbl_sv
    MG_TYPE = PERL_MAGIC_sv(\0)
    MG_OBJ = 0x1a98d90
    MG_LEN = 1
    MG_PTR = 0x1a803b4 "1"
<?:AD8=>
4:abc?def[?]
SV = PVMG(0x1a76b2c) at 0x1a98d9c
  REFCNT = 1
  FLAGS = (GMG,SMG,READONLY,pPOK)
  IV = 0
  NV = 0
  PV = 0x1a9359c "?"\0
  CUR = 1
  LEN = 11
  MAGIC = 0x1a994c4
    MG_VIRTUAL = &PL_vtbl_sv
    MG_TYPE = PERL_MAGIC_sv(\0)
    MG_OBJ = 0x1a98d90
    MG_LEN = 1
    MG_PTR = 0x1a803b4 "1"
(def)

-- 
perl -Mre=debug -e "/just|another|perl|hacker/"
