perl.perl4lib

perl.perl4lib https://www.nntp.perl.org/group/perl.perl4lib/ ... Copyright 1998-2025 perl.org Sun, 27 Apr 2025 21:23:29 +0000 ask@perl.org clustering module (2 messages) Greetings, which module you would suggest one should use, in order to be able to create clusters of the contents of a tag? Mostly i guess one should use the key collision fingerprint method... I plan to use https://metacpan.org/pod/Text::Fingerprint but i would appreciate if anyone has a better idea, thank you https://www.nntp.perl.org/group/perl.perl4lib/2018/05/msg3193.html Sat, 19 May 2018 13:03:31 +0000 Converting XML to MARC without reading a file? (12 messages) Hi - I'm pulling records from the WorldCat Search API in MARCXML, and need to convert them to binary MARC for further evaluation, which I'll do via MARC::Record. Problem: Converting from MARCXML via MARC::File::XML seems to require reading the records from a file. I've already got the XML stored in a variable, retrieved via LWP::Simple->get(). Do I have to write the XML to a file, then read it in again to convert it? Or am I just missing something obvious? I've tried things like: $xml = get($api_call); # also verified that $xml now contains MARCXML for 1 or more records my $batch = MARC::File::XML->in($xml); while (my $record = $batch->next()) { print $record; } but I get the error: Can't call method "next" on an undefined value Thanks --Andy https://www.nntp.perl.org/group/perl.perl4lib/2018/04/msg3181.html Thu, 19 Apr 2018 00:25:28 +0000 create a script out of many perl commands (1 message) perl script_one.pl file.txt perl script_two.pl perl script_three.pl command file_one.log file_two.log a perl one liner for example perl -lpe 's/\s*$//' How could i combine all the above in a bash script? 
i work on an ubuntu 16.04 machine I have tried with the && in linux, but i got errors Thank you in advance for any help https://www.nntp.perl.org/group/perl.perl4lib/2017/07/msg3180.html Tue, 18 Jul 2017 09:00:53 +0000 MARC::Record 2.0.7 and MARC::File::XML 1.0.5 released (1 message) Hi, I have uploaded MARC::Record 2.0.7 and MARC::File::XML 1.0.5 to CPAN. Both are minor bugfix and packaging update releases. Here are the changes to MARC::Record: 2.0.7 Tue May 23 20:41:13 EDT 2017 [FIXES] - RT#108123: clean up MANIFEST.SKIP - GH#1: marcdump now prints warnings (Johann Rolschewski) - remove a reference to SourceForge - fix a reference to the per4lib mailing list And to MARC::File::XML (note that 1.0.4 is not publicly indexed in CPAN due to a change in how PAUSE validates distribution names): 1.0.5 Tue May 23 21:24:18 EDT 2017 - adjust name of distribution to avoid colliding with MARC-XML, as PAUSE now enforces distribution name matching more strictly 1.0.4 Tue May 23 21:05:58 EDT 2017 (unreleased) - RT#111473: fix warning upon reaching end of XML stream (Johann Rolschewski) - remove extraneous diag from a test script (Florian Schlichting) Regards, Galen -- Galen Charlton gmcharlt@gmail.com https://www.nntp.perl.org/group/perl.perl4lib/2017/05/msg3179.html Wed, 24 May 2017 01:25:22 +0000 Job announcement - Systems Librarian (1 message) Skidmore College seeks a creative, user-oriented Systems Librarian to oversee library tools and systems that support collection maintenance and use, as well as related library and resource sharing services. The Systems Librarian will take a leadership role in implementing, maintaining, supporting, and enhancing a wide range of technologies and systems, and, in collaboration with key library staff and other campus partners, will investigate methods and best practices for assessing collections, patron experience, and library effectiveness. 
The College is particularly interested in candidates from underrepresented backgrounds and candidates who have had experience working with students from underserved populations. Based in the Lucy Scribner Library and reporting to the College Librarian, the Systems Librarian holds a 12-month contract and, as a non-tenured faculty member, participates in shared governance at the college. Scribner Library is looking for someone who is able to develop and administer a comprehensive technology plan that guides the library's adoption of next generation library systems. Projects on the horizon include transitioning to a new integrated library system, enhancing or replacing our digital collections platform, improving our library website, and improving or replacing our discovery services.
Scope of Duties
* Administer, maintain, and optimize library systems related to both in-house and consortial activities including, but not limited to, the integrated library system, discovery layer, electronic resource management tool, digital collections platform, interlibrary loan applications, institutional repository, and other resource sharing tools and platforms
* Ensure accessibility to and stability of the library's online information resources and digital collections. Provide technical expertise for problem resolution
* Oversee the functioning and maintenance of the library website, and provide leadership for future upgrades
* Participate in analysis and assessment in order to determine patron satisfaction with library collections, services, and consortial arrangements
* Collaborate with consortial partners on both an ongoing and as-needed basis (e.g., ConnectNY, New York 6, Eastern Academic Scholars' Trust)
* Perform regular periodic data analysis and migration between library systems. Assist and guide others to create and use effective reports for analysis
* Develop, maintain, and document programs and scripts that extend system functionality and automate routine tasks
* Collaborate closely with IT and library staff to ensure seamless service to users
* Participate in reference and instruction/departmental liaison activities in line with the successful candidate's strengths, interests, and institutional needs
* Demonstrate professional engagement and scholarship required for advancement through library faculty ranks
Required Qualifications
ALA-accredited MLS/MLIS, or equivalent education and experience. Advanced knowledge of emerging technologies and their impact on academic libraries. Demonstrated proficiency in at least one programming language, preferably Perl, experience with SQL, and experience working in a command line environment. Working knowledge of an integrated library system and discovery service. Practical familiarity with cataloging and metadata structures and relevant tools for data manipulation. Excellent analytical, organizational, and project management skills. Adeptness at problem-solving, and disposition to share technical knowledge with others. Ability to work as part of a team, as well as independently and flexibly in a changing environment. Highly effective communication and interpersonal skills
Desired Qualifications
Two years' experience supporting an integrated library system, platforms, and software in an academic library setting. Familiarity with Voyager, Ebsco Discovery, WorldShare Management Services, WorldCat, ILLiad, Ares, EZProxy, CONTENTdm, BePress Digital Commons, authority control (Library Technologies, Inc.), Joomla. Experience transitioning major library systems. Review of applications begins May 1st and will continue until the position is filled.
To learn more about and apply for this position please visit us online at: https://careers.skidmore.edu/applicants/Central?quickFind=58025 Skidmore College is committed to being an inclusive campus community and, as an Equal Opportunity Employer, does not discriminate in its hiring or employment practices on the basis of race, color, creed, religion, gender, age, national or ethnic origin, physical or mental disability, military or veteran status, marital status, sex, sexual orientation, gender identity or expression, genetic information, predisposition or carrier status, domestic violence victim status, familial status, dating violence, or stalking, or any other category protected by applicable federal, state or local laws. Employment at Skidmore College is contingent upon an acceptable background check result. CREATIVE THOUGHT MATTERS. Yvonne Kester Library Systems Analyst Lucy Scribner Library Skidmore College 518-580-5518 pronouns: she/her/hers why it matters: http://sites.miis.edu/cacsresources/2016/02/29/gender-pronouns-and-a-young-womans-career/ https://www.nntp.perl.org/group/perl.perl4lib/2017/04/msg3178.html Tue, 25 Apr 2017 14:21:02 +0000 Re: split a huge json file in separate per object files (1 message) I haven't actually used the module, but I'd at least take a look at JSON::Path. It's like XPath, but for JSON. I *have* used JSONpath before, just not in Perl. If you're interested in JSONPath, check out http://goessner.net/articles/JsonPath/. 
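A minimal sketch of one way to do the split described in the quoted message below, without a path library. It assumes the whole file fits in memory and that every top-level key is a non-negative integer (so that zero-padding to six digits makes sense). The helper names `split_json_by_key` and `write_split_files` are hypothetical, and JSON::PP (in core since Perl 5.14) is used only because it needs no installation; any JSON module with `decode`/`encode` should work similarly.

```perl
use strict;
use warnings;
use JSON::PP;

# Hypothetical helper: split a JSON object into one document per top-level key.
# Takes UTF-8 encoded JSON text and returns a hash ref of filename => JSON text.
# Assumption: top-level keys are non-negative integers such as "106" or "560".
sub split_json_by_key {
    my ($json_text) = @_;
    my $codec = JSON::PP->new->utf8->canonical->pretty;
    my $data  = $codec->decode($json_text);
    my %files;
    for my $key (keys %$data) {
        # Zero-pad the key to six digits: 106 becomes 000106.json
        my $name = sprintf('%06d.json', $key);
        # Each output document keeps the key as its single top-level member
        $files{$name} = $codec->encode({ $key => $data->{$key} });
    }
    return \%files;
}

# Hypothetical helper: write each piece into an existing directory.
# Returns the number of files written.
sub write_split_files {
    my ($files, $dir) = @_;
    for my $name (sort keys %$files) {
        open(my $fh, '>', "$dir/$name") or die "Cannot write $dir/$name: $!";
        print {$fh} $files->{$name};
        close($fh) or die "Cannot close $dir/$name: $!";
    }
    return scalar keys %$files;
}
```

Typical use would be to slurp the input file as raw bytes, pass it to `split_json_by_key`, and hand the result to `write_split_files` with an output directory. For input that is too large to hold in memory, an incremental parser would be needed instead; this sketch does not cover that case.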
On Fri, Mar 10, 2017 at 8:57 AM, Marios lyberak <marios.lyberak@gmail.com> wrote: > Hello community, > > i have a json file, with a structure like this: > > { > "106" : { > "id54011" : [ > { > "partno1" : "16690617" > }, > { > "partno2" : "5899180" > } > ], > "parts" : [ > "0899180", > "16920617" > ], > "id5632" : [ > { > "partno1" : "090699180" > } > ] > }, > "560" : { > "id9452" : [ > { > "partno2" : "1569855" > } > ], > "parts" : [ > "03653624", > "15899855" > ], > "id578" : [ > { > "partno3" : "0366393624" > }, > { > "partno4" : "0363213624" > } > ] > } > } > I need to split this json, into files, like this: > > each json file, will consist of one object. 000106.json, and 000560.json. > (all names, must gave 6 digits, so zeros must be added) > > I have tried to use eval for this, but no luck up to now... > > expected output: json file 1, named 000106.json: > > { > "106" : { > "id54011" : [ > { > "partno1" : "16690617" > }, > { > "partno2" : "5899180" > } > ], > "parts" : [ > "0899180", > "16920617" > ], > "id5632" : [ > { > "partno1" : "090699180" > } > ] > } > and json file 2, named 000560.json: > > { > "560" : { > "id9452" : [ > { > "partno2" : "1569855" > } > ], > "parts" : [ > "03653624", > "15899855" > ], > "id578" : [ > { > "partno3" : "0366393624" > }, > { > "partno4" : "0363213624" > } > ] > } > How would you handle this? > -- *Justin Rittenhouse* *Senior Application Development Technician, **Information Technology* *Hesburgh Libraries* 208 Hesburgh Library *o:* 574-631-3065 *e: *jrittenh@nd.edu <http://library.nd.edu/> https://www.nntp.perl.org/group/perl.perl4lib/2017/04/msg3177.html Thu, 13 Apr 2017 13:40:51 +0000 Getting started with Z39.50 (4 messages) Hello all....I haven't posted here for a long time, but have been doing lots of interesting stuff with MARC/Perl.... I would like to know an easy way to get started with Z39.50. (For example, how to get MARC records from the LC, NLM, etc. 
servers)Anyone have some program segments they would be willing to share? Thanks for your time and help.  ===Charles P. Hobbs cph1776@yahoo.com Author of _Hidden History of Transportation in Los Angeles_ (History Press) http://www.morethanredcars.com https://www.nntp.perl.org/group/perl.perl4lib/2017/04/msg3173.html Thu, 06 Apr 2017 20:44:38 +0000 split a huge json file in separate per object files (2 messages) Hello community, i have a json file, with a structure like this: { "106" : { "id54011" : [ { "partno1" : "16690617" }, { "partno2" : "5899180" } ], "parts" : [ "0899180", "16920617" ], "id5632" : [ { "partno1" : "090699180" } ] }, "560" : { "id9452" : [ { "partno2" : "1569855" } ], "parts" : [ "03653624", "15899855" ], "id578" : [ { "partno3" : "0366393624" }, { "partno4" : "0363213624" } ] } } I need to split this json, into files, like this: each json file, will consist of one object. 000106.json, and 000560.json. (all names, must gave 6 digits, so zeros must be added) I have tried to use eval for this, but no luck up to now... expected output: json file 1, named 000106.json: { "106" : { "id54011" : [ { "partno1" : "16690617" }, { "partno2" : "5899180" } ], "parts" : [ "0899180", "16920617" ], "id5632" : [ { "partno1" : "090699180" } ] } and json file 2, named 000560.json: { "560" : { "id9452" : [ { "partno2" : "1569855" } ], "parts" : [ "03653624", "15899855" ], "id578" : [ { "partno3" : "0366393624" }, { "partno4" : "0363213624" } ] } How would you handle this? https://www.nntp.perl.org/group/perl.perl4lib/2017/03/msg3171.html Fri, 10 Mar 2017 13:57:56 +0000 identify ISSN numbers in an mrc file (6 messages) Hello community, how would you treat the following? I need a way to identify all tags - subfields, that have stored an ISSN number in them. What would you suggest as a clever approach for this? 
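One possible approach, sketched under assumptions: rather than relying on the tag (022 and the $x subfields of linking fields are only the usual places), test every subfield value against the ISSN shape and its mod-11 check digit. The function names `is_valid_issn` and `find_issn_subfields` are hypothetical, and the input is modelled as plain `[tag, code, value]` triples so the sketch runs without any MARC module installed.

```perl
use strict;
use warnings;

# Return 1 if the candidate has the ISSN shape (NNNN-NNNC, hyphen optional)
# and its final character matches the mod-11 check digit, else 0.
sub is_valid_issn {
    my ($candidate) = @_;
    return 0 unless $candidate =~ /\A(\d{4})-?(\d{3})([\dXx])\z/;
    my @digits = split //, "$1$2";
    my $check  = uc $3;

    # Weights run from 8 down to 2 over the first seven digits
    my $sum = 0;
    for my $i (0 .. 6) {
        $sum += $digits[$i] * (8 - $i);
    }
    my $expected = (11 - $sum % 11) % 11;
    $expected = 'X' if $expected == 10;
    return $check eq $expected ? 1 : 0;
}

# Scan [tag, subfield code, value] triples and return those that contain
# a hyphenated, checksum-valid ISSN, as [tag, code, issn] triples.
sub find_issn_subfields {
    my (@triples) = @_;
    my @hits;
    for my $triple (@triples) {
        my ($tag, $code, $value) = @$triple;
        while ($value =~ /\b(\d{4}-\d{3}[\dXx])\b/g) {
            push @hits, [$tag, $code, $1] if is_valid_issn($1);
        }
    }
    return @hits;
}
```

With MARC::Record, the triples could be built by looping over `$record->fields`, skipping control fields, and reading the `[code, data]` pairs returned by `$field->subfields`. The checksum only filters out most false positives; an eight-digit number that happens to pass it would still be reported.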
Thank you https://www.nntp.perl.org/group/perl.perl4lib/2016/11/msg3165.html Wed, 02 Nov 2016 08:57:20 +0000 creating an index of files contents (2 messages) hello community, Say we have the following structure in our filesystem: dir1 dir2 dir3 dir4 dir stands for directory of course. In dir1, there is a file1.txt that has in it numbers, like below 6576576 898798789 5645436549 76567576576 876876876876 Same goes for dir2. In dir2, there is a file2.txt, that has in it numbers, like below 6576576 89879878963 56454365492 765675765763 8768768768765 And so with all the rest of the folders. What we need to do, is have a new file (like an index) out of all directories and files values, like below: dir1;6576576,898798789,5645436549,76567576576,876876876876 dir2;6576576,89879878963,56454365492,765675765763,8768768768765 And secondly, another index file, which will have the reverse info 6576576;dir1,dir2 Any ideas on how would you approach this? Best https://www.nntp.perl.org/group/perl.perl4lib/2016/09/msg3163.html Fri, 23 Sep 2016 06:49:03 +0000 scripts imported into dancer (2 messages) Hello great minds, not a perl hacker, so please bare with my questions... i am trying to make my work served via Dancer. I have many scripts, that i run in the console: perl somecommand filetobeprocessed.mrc and then in the folder i ran the command, i get the result file. My question is how i could improve this process, a) how i could combine all scripts that take as input the same input file into a Dancer application. When the last script finishes with creating the last output file, then i guess Dancer will redirect to another page, where one could download all output (produced) files. b) in order to know when all scripts are finished, i need to use promises? I would love to hear your pro approach, and if possible, if you could point me to a prototype dancer web app that works like this... 
c) last, if you had any suggestions, on how i should arrange the folders in this web app, for example where it would be best to keep the input mrc files, where to keep the code, where to store the output files... Looking forward to your collective wisdom answers, Cheers https://www.nntp.perl.org/group/perl.perl4lib/2016/09/msg3161.html Sat, 10 Sep 2016 13:41:05 +0000 Save the date: Mashcat meeting in Atlanta, Georgia, 24 January 2017 (1 message) We are excited to announce that the second face-to-face Mashcat event in North America will be held on January 24th, 2017, in downtown Atlanta, Georgia, USA. We invite you to save the date. We will be sending out a call for session proposals and opening up registration in the late summer and early fall. Not sure what Mashcat is? "Mashcat" was originally an event in the UK in 2012 aimed at bringing together people working on the IT systems side of libraries with those working in cataloguing and metadata. Four years later, Mashcat is a loose group of metadata specialists, cataloguers, developers and anyone else with an interest in how metadata in and around libraries can be created, manipulated, used and re-used by computers and software. The aim is to work together and bridge the communications gap that has sometimes gotten in the way of building the best tools we possibly can to manage library data. Among our accomplishments in 2016 was holding the first North American face-to-face event in Boston in January and running webinars. If you're unable to attend a face-to-face meeting, we will be holding at least one more webinar in 2016. To learn more about Mashcat, visit http://mashcat.info. Thanks for considering, and we hope to see you in January. -- Galen Charlton gmcharlt@gmail.com https://www.nntp.perl.org/group/perl.perl4lib/2016/07/msg3160.html Tue, 12 Jul 2016 22:41:59 +0000 MARC::Spec and Catmandu::Fix::MARCspec (1 message) Dear list members! 
I like to announce the release of my first two Perl projects: · MARC::Spec [1] parses a MARCspec [2] as string or can be used to build MARCspec objects. · Catmandu::Fix::marc_spec [3] is a MARCspec interpreter for Catmandu [4]. Any comments and questions are welcome! Additional tests would be nice. Please report bugs at [5] and [6]. Cheers! Carsten [1] MARC::Spec - A MARCspec parser and builder <https://metacpan.org/pod/MARC::Spec> [2] MARCspec - A common MARC record path language <http://marcspec.github.io/MARCspec/> [3] Catmandu::Fix::marc_spec - reference MARC values via MARCspec - A common MARC record path language <https://metacpan.org/pod/Catmandu::Fix::marc_spec> [4] Catmandu - the data processing toolkit http://librecat.org/ [5] <https://github.com/MARCspec/MARC-Spec/issues> [6] <https://github.com/cKlee/Catmandu-Fix-marc_spec/issues> _______________________________________________ Carsten Klee Abt. Überregionale Bibliographische Dienste IIE Staatsbibliothek zu Berlin - Preußischer Kulturbesitz Potsdamer Straße 33 10785 Berlin Fon: +49 30 266-43 44 02 Fax: +49 30 266-33 40 01 carsten.klee@sbb.spk-berlin.de www.zeitschriftendatenbank.de https://www.nntp.perl.org/group/perl.perl4lib/2016/07/msg3159.html Tue, 12 Jul 2016 07:27:10 +0000 identify encoding from a file (4 messages) <?xml version="1.0" encoding="UTF-8"?> <TABLE NAME="LIBGROUP.DB"> <FIELDS COUNT="8"> <FIELD TYPE="ftSmallint" SIZE="2">GroupID</FIELD> <FIELD TYPE="ftString" SIZE="51">GroupName</FIELD> <FIELD TYPE="ftSmallint" SIZE="2">Borrow</FIELD> <FIELD TYPE="ftSmallint" SIZE="2">BorrowDuration</FIELD> <FIELD TYPE="ftSmallint" SIZE="2">Reserve</FIELD> <FIELD TYPE="ftSmallint" SIZE="2">ReserveDuration</FIELD> <FIELD TYPE="ftSmallint" SIZE="2">Prolong</FIELD> <FIELD TYPE="ftSmallint" SIZE="2">ProlongDuration</FIELD> </FIELDS> <INDICES COUNT="2"> <PRIMARY> <FIELD>GroupID</FIELD> </PRIMARY> <INDEX NAME="libgroup-gropuname-idx"> <FIELD>GroupName</FIELD> </INDEX> </INDICES> <DATA RECORDS="6"> <RECORD 
ID="1"> <GROUPID>2</GROUPID> <GROUPNAME>Unlimited</GROUPNAME> <BORROW>-1</BORROW> <BORROWDURATION>-1</BORROWDURATION> <RESERVE>-1</RESERVE> <RESERVEDURATION>-1</RESERVEDURATION> <PROLONG>-1</PROLONG> <PROLONGDURATION>-1</PROLONGDURATION> </RECORD> <RECORD ID="2"> <GROUPID>3</GROUPID> <GROUPNAME>Typical</GROUPNAME> <BORROW>5</BORROW> <BORROWDURATION>2</BORROWDURATION> <RESERVE>3</RESERVE> <RESERVEDURATION>1</RESERVEDURATION> <PROLONG>5</PROLONG> <PROLONGDURATION>3</PROLONGDURATION> </RECORD> <RECORD ID="3"> <GROUPID>4</GROUPID> <GROUPNAME>Can't charge</GROUPNAME> <BORROW>0</BORROW> <BORROWDURATION>0</BORROWDURATION> <RESERVE>5</RESERVE> <RESERVEDURATION>5</RESERVEDURATION> <PROLONG>0</PROLONG> <PROLONGDURATION>0</PROLONGDURATION> </RECORD> <RECORD ID="4"> <GROUPID>5</GROUPID> <GROUPNAME>Can't reserve</GROUPNAME> <BORROW>5</BORROW> <BORROWDURATION>15</BORROWDURATION> <RESERVE>0</RESERVE> <RESERVEDURATION>0</RESERVEDURATION> <PROLONG>5</PROLONG> <PROLONGDURATION>15</PROLONGDURATION> </RECORD> <RECORD ID="5"> <GROUPID>7</GROUPID> <GROUPNAME>̡觴ݲ</GROUPNAME> <BORROW>5</BORROW> <BORROWDURATION>10</BORROWDURATION> <RESERVE>1</RESERVE> <RESERVEDURATION>3</RESERVEDURATION> <PROLONG>1</PROLONG> <PROLONGDURATION>5</PROLONGDURATION> </RECORD> <RECORD ID="6"> <GROUPID>8</GROUPID> <GROUPNAME>ʡ解紝</GROUPNAME> <BORROW>5</BORROW> <BORROWDURATION>10</BORROWDURATION> <RESERVE>2</RESERVE> <RESERVEDURATION>3</RESERVEDURATION> <PROLONG>2</PROLONG> <PROLONGDURATION>10</PROLONGDURATION> </RECORD> </DATA> </TABLE> https://www.nntp.perl.org/group/perl.perl4lib/2016/02/msg3155.html Sat, 06 Feb 2016 12:39:50 +0000 MARC/Perl moved to GitHub; GitHub perl4lib organization (1 message) Hi, I have moved the main repository for MARC/Perl [1] from SourceForge to GitHub; it can now be found at https://github.com/perl4lib/marc-perl In the process of doing this, I have created a "perl4lib" organization in GitHub. 
Anybody who maintains a project coded in Perl that is relevant to libraries is welcome to use this organization as a "home" for their Git repositories. I am also seeking folks who are willing to act as co-owners of the organization; at present the owners are: * Galen Charlton * Francis Kayiwa If you are interested, please get in touch with me. [1] I.e., MARC::Record, MARC::File::XML, MARC::Lint, MARC::Charset, and MARC::File::MiJ Regards, Galen -- Galen Charlton gmcharlt@gmail.com https://www.nntp.perl.org/group/perl.perl4lib/2016/01/msg3154.html Mon, 18 Jan 2016 16:40:48 +0000 Call for testers: next release of Net::OAI::Harvester Perl module may break legacy custom handlers (1 message) -----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 [Apologies for cross-posting] The Perl module Net::OAI::Harvester implements a client framework for the OAI Protocol for Metadata Harvesting (OAI-PMH) and was authored and originally maintained by Ed Summers. It has been available on CPAN since 2003, and its last stable version, 1.15, was released almost four years ago: < http://search.cpan.org/~thb/OAI-Harvester-1.15/ >. Since one of the repositories used for testing vanished from the web some time ago and this is breaking the test suite, a new version has to be released fairly soon. Over time I had been tackling various minor issues and published developer releases on CPAN, cf. 
the list at the end of this mail or the Changes document linked at the CPAN page for the current release 1.16_12: < http://search.cpan.org/~thb/OAI-Harvester-1.16_12/ > However the sum of these changes is not negligible and specifically their impact on "custom metadata handlers" (which are to be used when processing other metadata formats than oai_dc) may affect applications using the module: >>>>> Up to version 1.15 the metadataHandler was inconsistently fed with input : - GetRecord exposed the almost complete XML response to the Handler (including start_document/end_document events) - ListRecords exposed the (OAI)record element (header, metadata and optional about containers) but did not propagate start_document or end_document events. In both cases the events for the header tags itself and for the optional setSpec subelements had not been forwarded Version 1.20 introduces a modified behavior for metadataHandler and an additional recordHandler: - a metadataHandler will see only the (single) subelement of the OAI metadata element (so for an deleted record it might never be invoced at all) - a recordHandler will see the OAI record element and its subelements Therefore a metadataHandler will now be confined to the metadata fragment(s) of the response, and the new recordHandler approximates the old behavior of ListRecords, however OAI-PMH:identifier and OAI-PMH:datestamp will now be properly encapsulated within their OAI-PMH:header element. Additionally, two new methods responseDate() and request() allow access to the corresponding top-level OAI-PMH elements in all response types. A SAX filter of class Net::OAI::Record::DocumentHelper may be used to inject start_document and end_document events into the chain if they are needed. As a temporary measure, you may set $Net::OAI::Harvester::OLDmetadataHandler =1 to change the behavior of handlers passed as "metadataHandler" into that of a recordHandler. 
<<<<<< Obviously the change of semantics for a metadataHandler to deal with the "metadata" elements of the response instead of the "record" elements is a design decision and may be questioned by users of the module. The current version also contains several changes which solve deficiencies of Net::OAI::Harvester 1.15 but possibly break existing workarounds for these deficiencies. For example officially (per documentation) you never could acccess the responseDate of the OAI-PMH result, but due to a sloppy implementation of processing for the identify verb it was possible to extract it in this case by an undocumented method. The current version supplies a dedicated responseDate() accessor for all verbs but at the same time fixes the behavior in the identify case. I may be overly optimistic but my impression is that the changes between the current 1.15 and the coming version (most probably numbered 1.20) do actually fix many issues but the fear is realistic (I experienced that myself with an old application of mine using the module) that these fixes may conflict with workarounds introduced by users to make things work before. *** So please, if you are currently using Net::OAI::Harvester *and* had been forced to introduce workarounds or tweak internals of the module, perform thorough testing before upgrading to the coming stable version, preferably already now with the developer version 1.16_12. And, please, please: provide feedback if you should run into trouble, either via the CPAN request tracker for the module at < https://rt.cpan.org/Public/Dist/Display.html?Name=OAI-Harvester > or by direct mail. Sorry for the inconvenience viele Gruesse Thomas Berger Changes to Net::OAI::Harvester since version 1.15 1.16_12 Tue, Jan 12 00:20:05 CET 2016 - - dealing with CPANTS Kwalitee issues, esp. 
version number mess - - new filter class Net::OAI::Record::DocumentHelper for tweaking 1.16_11 Tue, Jan 12 00:20:05 CET 2016 - - minor cleanup 1.16_10 Mon, Jan 11 01:29:46 CET 2016 - - renamed alldata() method for accessing recordHandler results to recorddata() - - better propagation of namespace prefix mapping events - - Net::OAI::NamespaceFilter with a result() method - - Net::OAI::NamespaceFilter tested with XML::SAX::Writer - - AUTHOR formatting 1.16_09 Sun, Feb 14 17:29:39 CET 2014 - - Net::OAI::NamespaceFilter as kind of generic metadata handler - - Queries are now constructed basing on a copy of the Harvester's baseURL - - pass parameters to URI->query_form() more reproducably, (esp. "verb" should now always be first to accommodate some allegedly broken repositories) - - temporary? tests for correctness of LWP operations 1.16_07 Tue, Apr 30 01:26:40 CEST 2013 - - added new methods: response(), responseDate(), error() - - Smoke still tests failed on 'Bad Host' tests (wrong error codes induced by HTTP proxies?) - - aligned behavior of metadataHandler for listRecords() and getRecord() - - introducing alternative recordHandler for listRecords() and getRecord() - - removed erroneous resumptionToken handling for identify() 1.16_04 Fri Dec 7 09:49:03 CET 2012 - - consider HTTP proxies in design of t/003.error.t - - 'Bad Host' tests failing b/c error code 500 is not the expected code 404 (due to some recent change in LWP)? 1.16_01 Mon Apr 2 23:14:35 CEST 2012 - - Modules were not namespace aware. 
- - Add HTTPRetryAfter() method (catches HTTP Retry-After header) - - Check responses for Content-Type and charset before parsing - - Net::OAI::Header handed up (empty) header elements and other stuff to the request's metadataHandler - - SKIP tests when HTTP errors are encountered -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iJwEAQECAAYFAlaY2iYACgkQYhMlmJ6W47MMCwP/Yhij11TfEL1dfYtimdXG8hkf FYLvvXECzECPxKbHIC0dKvf5v4myW8oedlK3B+oOzIjjOY60pT7pdC4KB/xgU+a1 N1djewSgT4hJ3IoacmUkLpnh81NSM1oA0osw48qVco4qpxDOY2HrR3bdBZksKBcI lQH10kIYqo/TZYGXHYQ= =v03A -----END PGP SIGNATURE----- https://www.nntp.perl.org/group/perl.perl4lib/2016/01/msg3153.html Fri, 15 Jan 2016 11:38:32 +0000 Opening & writing to UTF-8 files; copyright symbol again --solution (6 messages) I should probably say, "apparent solution" 'cause character set issues never seem to end. However, combining Jon Gorman's recommendation with some Googling, I get: my $outfile='4788022.edited.bib'; open (my $output_marc, '>', $outfile) or die "Couldn't open file $!" ; binmode($output_marc, ':utf8'); The open statement may not be quite correct, as I am not familiar with the more current techniques for opening file handles that John mentioned. However, when I use those instructions to open the output file rather than what I had before, the copyright symbol does indeed come across as C2 A9 as it was in the original record. I didn't want to use the utf8, because I've tried that before and ended up with double-encoding (and a real mess). But I'll continue testing. The results of the googling I referred to can be found at: https://groups.google.com/forum/#!topic/perl.perl4lib/sy7hqiBQ1yM Anne L. 
Highsmith Director, Consortia Systems TAMU Libraries 5000 TAMU College Station, TX 77843-5000 979 862 4234 hismith@tamu.edu https://www.nntp.perl.org/group/perl.perl4lib/2015/11/msg3147.html Fri, 13 Nov 2015 22:05:13 +0000 Opening & writing to UTF-8 files; copyright symbol again (3 messages) This is related to my previous post (9/17/2015) about deleting 035 fields after RDA-ification. Jon Gorman solved that one for me by pointing out that I probably had a problem with my perl libraries. But now, instead of creating the record from the database and writing it back to the database, I am reading from a file exported from my database, which is UTF-8. Specifically, the blasted copyright symbol again. As stored in the database, the copyright symbol is encoded as C2 A9, which, if I read the tables correctly, is the correct UTF-8 encoding for copyright. But when I read the record from a file and write it back to the file after deleting the problematic 035, the encoding for the copyright symbol has been turned into A9. This "transformation" happens both when running the perl program on my pc and on the unix server. Interestingly, complicated Unicode seems to be okay. I took a record with Hebrew vernacular characters and edited it using my program, then ran the source record and target record through xxd. I then diffed the files; it showed no difference. But the before and after of the record that has the copyright symbol munges the copyright by stripping the C2. Here's the program. If anybody can tell me what I'm doing wrong I'd really appreciate it. 
---------------------------------------------------------------------------------------------------------- use strict; use warnings; use MARC::Record; use MARC::Batch; my $infile='4788022.bib'; my $batch = MARC::Batch->new('USMARC',"$infile"); my $outfile='4788022.edited.bib'; open(OUTPUT, ">$outfile"); while (my $record = $batch->next) { my $f001 = $record->field('001'); my $bib_id = $f001->as_string(); my @a035 = $record->field('035'); foreach my $f035 (@a035) { if (my $f035a = $f035->subfield('a')) { if ($f035a eq $bib_id) { $record->delete_field($f035); } } } print OUTPUT $record->as_usmarc(); } Anne L. Highsmith Director, Consortia Systems TAMU Libraries 5000 TAMU College Station, TX 77843-5000 979 862 4234 hismith@tamu.edu https://www.nntp.perl.org/group/perl.perl4lib/2015/11/msg3144.html Fri, 13 Nov 2015 20:01:43 +0000 Editing marc records; program exits with Encode.pm error oncopyright symbol (2 messages) https://www.nntp.perl.org/group/perl.perl4lib/2015/09/msg3142.html Thu, 17 Sep 2015 18:53:45 +0000 script help list all files in folders and subfolders (2 messages) Hello friendly folks, i would appreciate any help on the following: say we have a folder with thousands of html files. Since the file browser crashes, i am looking at making a script that would do the following: Distribute all html files in folders, say 001, 002, 003, etc, sorted by the html files' names. Then, i would like to create an index also an html file, that would exist in each folder, containing links to all the folder's stored html files. Also, if possible, a general index in the parent folder, that would allow for a tree like showing the directories as links, and in a tree like representation, below the links of the files contained in each folder. Thank you in advance p.s. 
we use Perl 5.18,in an ubuntu 14.04 server https://www.nntp.perl.org/group/perl.perl4lib/2015/03/msg3140.html Tue, 31 Mar 2015 07:59:23 +0000 Options for translating languages within perl scripts (3 messages) Hi, I've been tasked with massaging a large batch of French-language MARC records from a vendor. Aside from the usual MARC field manipulation/cleanup we usually do with perl, I've been asked to run the 520 field through a translation routine/API, etc. to convert (possibly crudely) from French to English. I thought that Babelfish or http://api.yandex.com/translate/doc/dg/reference/translate.xml might be options, but Babelfish appears to be dead, and when I clicked to get the required key for the yandex API, the link led to a dead end. Is anyone incorporating POST queries or other methods to translate fields in MARC records? I'd appreciate any leads or pointers. Thanks in advance, Eileen Pinto Library Systems Office University of California, Berkeley Berkeley, CA 94720-6000 https://www.nntp.perl.org/group/perl.perl4lib/2015/02/msg3137.html Thu, 26 Feb 2015 22:29:24 +0000 UNICODE character identification (7 messages) Hello friendly folks, follows what i am trying to do, and i am looking for your help in order to find the most clever way to achieve this: We have records, that include typos like this: we have a word say Plato, where the last o is inputted with the keyboard set to Greek language, so we need something that would parse all metadata in a per character basis, check against what is the script language that the majority of characters the word belongs to have, and return the odd characters, the script they belong, and the record identifier they were found in, so as to be able to correct them thank you in advance https://www.nntp.perl.org/group/perl.perl4lib/2015/02/msg3130.html Tue, 10 Feb 2015 12:27:03 +0000 UNIMARC example file? (4 messages) hello everybody, could someone provide me with an example UNIMARC iso2709 file in order to test a module? 
Thank you in advance https://www.nntp.perl.org/group/perl.perl4lib/2014/12/msg3126.html Thu, 11 Dec 2014 10:08:48 +0000
send emails via perl (12 messages) hello, we need to use the easiest solution, if possible just a perl module, to be able to send automated emails on an Ubuntu server. The scenario is this: we run a cron job, and say we would like to send a message after completion to a certain account, for example a gmail account. The ideal would be to not use any mailer, is this possible? Or could you please suggest the best and easiest approach? Thank you https://www.nntp.perl.org/group/perl.perl4lib/2014/11/msg3114.html Wed, 19 Nov 2014 08:30:56 +0000
New version of MARC::Lint available (1 message) I have posted a new version of MARC::Lint to CPAN [1]. This version applies the changes found in MARC 21 updates 17 [2] and 18 [3]. [1] http://search.cpan.org/~eijabb/MARC-Lint_1.48/ [2] http://www.loc.gov/marc/up17bibliographic/bdapndxg.html [3] http://www.loc.gov/marc/bibliographic/bdapndxg.html Thank you for your time. Bryan Baldus bryan.baldus@quality-books.com eijabb@cpan.org http://home.comcast.net/~eijabb/ https://www.nntp.perl.org/group/perl.perl4lib/2014/07/msg3110.html Mon, 21 Jul 2014 13:39:13 +0000
Finding non-unicode characters (2 messages) Can someone suggest a way to identify if a MARC record, coded at LDR/09 = 'a', has non-unicode characters in it? I tried the following, kind of grasping at straws, against a record that I know has non-unicode characters. It didn't report any errors.

# $bib_id is defined as 001 field
my $bib_marc = [subroutine defined elsewhere to get a marc record string];
eval { $bib_rec = MARC::Record->new_from_usmarc($bib_marc); } ;
if ($@) {
    print ERRORS "$bib_id\t$@\n";
    next;
}

We have a group of records in our database that are mostly Unicode but have some erroneous characters. I'd like to have a script to run against them to see if they've been completely cleaned up after the catalogers work on them. Anne L.
Highsmith Director of Consortia Systems Texas A&M University 5000 TAMU College Station, TX 77843-5000 Phone: 979 862 4234 Fax: 979 845 6238 Email: hismith@tamu.edu https://www.nntp.perl.org/group/perl.perl4lib/2014/06/msg3108.html Mon, 30 Jun 2014 14:51:13 +0000
Converting MARC fields with Catmandu - repeated subfields being squished together. (10 messages) I'm using catmandu to JSON-ise MARC records for storage in Elasticsearch, and seem to have come up with something that I can't readily see how to fix (without getting down and dirty with fixers.) I have a record that has this: ["650"," ","0","a","Time","v","Pictorial works","v","Juvenile literature.","9","15531"] and a mapping: marc_map('650v', 'subject.$append') This works well enough in most cases, however when the subfield is doubled up, I end up with: "subject":["Time","Pictorial worksJuvenile literature."] The $append doesn't seem to apply in this case. This only seems to happen to repeats within a field, other 650$v subfields are in their own strings, though suffer the same problem. Is this a bug in Catmandu-MARC? I've tried reading the marc_map.pl file, but the lack of internal documentation, and the nature of what it's doing make it not the easiest thing to understand. -- Robin Sheat Catalyst IT Ltd. +64 4 803 2204 GPG: 5FA7 4B49 1E4D CAA4 4C38 8505 77F5 B724 F871 3BDF https://www.nntp.perl.org/group/perl.perl4lib/2014/06/msg3098.html Fri, 06 Jun 2014 03:11:40 +0000
sending marc records into a script that uses MARC::Batch (7 messages) Hello, Two questions please: 1. I've written a script that opens a marc file for reading using this syntax: $file = $ARGV[0]; $batch = MARC::Batch->new('USMARC',$file); It then loops thru the records using this syntax: while ( $record = $batch->next()) { .....check position 6, 7 of leader and position 23 of 008 and make some changes } This works great.
However, instead of accessing the file this way, I want to pipe the output of a previously run marc dump command directly into this script. I understand that this can be done using this syntax: while ($line =<STDIN>){ ...}, but I don't understand how to use that STDIN with "MARC::Batch->new('USMARC',$file);" This does not work: $batch = MARC::Batch->new('USMARC',<STDIN>); 2. My current script successfully reads and processes a marc file of over 5 gigs!....but exits entirely on record 160,585 with the error "Can't call method "as_string" on an undefined value at ./marc_batch.pl". Documentation on using MARC::Batch says that to tell it to continue processing even when errors are encountered one should use strict_off(), then print/report warnings at the bottom of the script. I don't think my particular error is being handled by the strict_off() setting. Does anybody know what causes, or how to fix, the "Can't call method as_string" error? Full script below; it's pretty short, thanks to MARC::Batch. Thanks for any insights!
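Two things stand out in the question above and can be sketched without MARC::Batch at all. The "as_string" error is what Perl reports when field('008') returns undef, i.e. for a record that has no 008, and strict_off() does not cover that because it is an ordinary method call on undef rather than a parse error. Separately, && binds tighter than ||, so a test written as A && B && C || D also matches every record where D alone is true. The helper name is_online_monograph and the sample values are made up for illustration:

```perl
use strict;
use warnings;

# Hypothetical helper: true when leader/06 is 'a', leader/07 is 'm'
# and 008/23 is 'o' or 's'. $f008 may be undef for records lacking an 008.
sub is_online_monograph {
    my ($leader, $f008) = @_;
    return 0 unless defined $f008 && length($f008) > 23;
    my $form = substr($f008, 23, 1);
    # Parentheses around the || keep it from escaping the && chain.
    return ( substr($leader, 6, 1) eq 'a'
          && substr($leader, 7, 1) eq 'm'
          && ($form eq 'o' || $form eq 's') ) ? 1 : 0;
}

my $leader_book = '00000nam a2200000 a 4500';    # leader/06 = a, leader/07 = m
my $leader_map  = '00000nem a2200000 a 4500';    # leader/06 = e
my $f008_online = ('x' x 23) . 's' . ('x' x 16); # 008/23 = s

print is_online_monograph($leader_book, $f008_online), "\n";  # 1
print is_online_monograph($leader_map,  $f008_online), "\n";  # 0
print is_online_monograph($leader_book, undef),        "\n";  # 0
```

In the real loop the second argument would come from something like my $field = $record->field('008'); followed by $field ? $field->as_string() : undef. On the first question, the MARC::Batch documentation says filehandles can be passed in place of file names, so MARC::Batch->new('USMARC', \*STDIN) is worth trying; that call has not been tested here.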
use MARC::Batch;

$file = $ARGV[0];
chomp($file);
$batch = MARC::Batch->new('USMARC',$file);
$batch->strict_off(); # otherwise script exits when encounters errors
open(OUT,'>new_marc');

while ( $record = $batch->next()) {
    $leader = $record->leader();
    $leader_pos_6 = substr($leader,6,1);
    $leader_pos_7 = substr($leader,7,1);
    $field = $record->field('008');
    $field_008 = $field->as_string();
    $field_008_position_23 = substr($field_008,23,1);
    if ( ($leader_pos_6 eq "a") && ($leader_pos_7 eq "m") && ($field_008_position_23 eq "o") || ($field_008_position_23 eq "s") ) {
        $control_num = $record->field('001');
        $control_num = $control_num->as_string();
        print "008 position 23: $field_008_position_23 \n";
        print "OLD leader: $leader \n";
        $old_leader = $leader;
        substr($leader,6,1) = 'm';
        print "NEW leader: $leader \n";
        print OUT $record->as_usmarc();
        print "$control_num|$old_leader|$leader|$field_008\n";
    } else {
        # not a match so just print this one unchanged...
        print OUT $record->as_usmarc();
    }
}

# handles errors:
if (@warnings = $batch->warnings()) {
    print "\n Warnings detected: \n", @warnings;
}
close(OUT);
close(LOG);

John Guillory Louisiana Library Network 225.578.3758 https://www.nntp.perl.org/group/perl.perl4lib/2014/05/msg3091.html Thu, 29 May 2014 16:08:16 +0000
2 Bibliographic Software Engineering vacancies, University of Edinburgh (1 message) Hi folks, I hope you don't mind me bringing to your attention these two software engineering posts advertised at the University of Edinburgh in EDINA's Bibliographic and Multimedia Services team?: Software Engineer, Salary £30,728 to £36,661 http://edin.ac/1kAhdvZ Software Engineer, Salary £25,759 - £29,837 http://edin.ac/1kAhijw (If the links don't work, go to https://www.vacancies.ed.ac.uk/ and use references 026376 and 026399, respectively) Cheers, Ben From the job advertisements: Would you like to build online services to support higher and further education?
We need a developer to help implement and maintain innovative virtual library services, delivered to higher education institutions across the UK. This is an excellent opportunity to join a talented and friendly group of software engineers, helping to develop our services further and contribute to new projects. We do object-oriented programming, keep our data in relational databases and search servers, and use web frameworks to design our interfaces. Most of us develop on Linux or Mac platforms, and our services are delivered from a mix of Red Hat Enterprise Linux and Enterprise Solaris servers. You should be a graduate in a computing-related discipline, or have relevant experience, with significant experience of web application development. Your skills will include object-oriented programming in a language such as Java or Perl. Knowledge of web services or machine-to-machine interfaces (e.g. SRU, OAI-PMH, REST), repository software (e.g. DSpace, Eprints), would be an advantage - but most important is initiative along with good analysis and problem-solving skills, so that you can react in an informed and creative way to problems and new requirements. Benefits of working at the University include flexible working, an excellent pension, career prospects and generous holiday provision. This post is on a fixed term basis for 2 years. Closing date: Thursday 20th March 2014 at 5pm https://www.nntp.perl.org/group/perl.perl4lib/2014/03/msg3090.html Thu, 06 Mar 2014 11:42:03 +0000 Basic questions -- skipping records that don't pass muster (2 messages) Through various programs, when I've been processing MARC records and found 1 that didn't pass muster (couldn't pass structural or character set requirements) I generally slogged through, found it, and fixed it. But now I have passed on some of my code to someone who doesn't have the time or the expertise to do that, so I'm asking for help on behalf of both of us. 
First, let's say I have a hash of record id numbers (voyager bib ids for those who speak voyager). And here's a code snippet:

foreach my $bib (sort keys %list_of_docs_bibs) {
    # put the bits of the marc record back into a marc string
    my $bib_marc = &get_bib_string($dbh, $dbase, $bib);
    # create a marc record object from marc record string
    my $bib_rec = MARC::Record->new_from_usmarc($bib_marc);
    #do useful stuff to $bib_rec
}

Occasionally, this code will hit a bib record that has an invalid tag and will blow up with something like: " Tag "`1`" is not a valid tag at ...USMARC.pm line 222" What is an appropriate way to print out that record to an error file and go on to the next record, rather than having the program blow up and stop?
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Second, related question. Although all of the records in the voyager database are SUPPOSED to be in Unicode, one occasionally encounters a record that has non-Unicode characters. So, pretty much same question -- What is an appropriate way to print out that record someplace and go on to the next record, rather than having the program blow up and stop? Suggestions gratefully appreciated Anne L. Highsmith Director of Consortia Systems Texas A&M University 5000 TAMU College Station, TX 77843-5000 Phone: 979 862 4234 Fax: 979 845 6238 Email: hismith@tamu.edu https://www.nntp.perl.org/group/perl.perl4lib/2014/02/msg3078.html Thu, 13 Feb 2014 21:20:19 +0000
MARC::File::XML 1.0.3 released (4 messages) Hi, I have uploaded MARC::File::XML 1.0.3 to CPAN. This release fixes a test failure on Strawberry Perl.
Here is the changelog: 1.0.3 Thu Jan 23 05:02:52 UTC 2014 - fix t/external-entities.t to pass on Strawberry Regards, Galen -- Galen Charlton gmcharlt@gmail.com https://www.nntp.perl.org/group/perl.perl4lib/2014/01/msg3075.html Thu, 23 Jan 2014 05:14:58 +0000
SECURITY release: MARC::File::XML 1.0.2 (2 messages) Hi, I have uploaded [1] version 1.0.2 of MARC::File::XML. This is a security release that repairs an XML external entity (XXE) vulnerability. I recommend that all users of MARC::File::XML upgrade promptly. Here is the change log entry: 1.0.2 Tue Jan 21 17:18:37 UTC 2014 - MARC::File::XML will now die upon parsing a record that declares an external entity and tries to use it. This prevents the potential unwanted disclosure of the contents of files on the server by applications that embed this module. If, for some reason, an application needs to process MARCXML records that contain external entities, set_parser() can be used to force the use of an XML::LibXML parser that is configured to process external entities. The issue was reported by John Lightsey. [1] https://metacpan.org/release/GMCHARLT/MARC-XML-1.0.2 Regards, Galen -- Galen Charlton gmcharlt@gmail.com https://www.nntp.perl.org/group/perl.perl4lib/2014/01/msg3073.html Tue, 21 Jan 2014 17:38:43 +0000
Re: AW: [librecat-dev] A common MARC record path language (10 messages) Hi Carsten Apologies for the late reply, it took a while to get the system booted after winter vacations. You are right in the discussion about which parts should be specified by a MARCspec language and which parts should be implemented as operations on the nodes found. I gave the examples not as a hint about the implementation language (e.g. whether it requires regular expressions or not) but as examples of MARC in the wild (non standard tags) and MARC combined with cataloging rules (where subfields and characters in front of a subfield have a special meaning).
In daily work I often encounter mapping rules which involve these special subfield cases ("Take everything from the 245 until you hit the first / before a subfield"). These things can't easily be expressed (can they?) in XPath when using XSLT, or in MARCspec when using tools like Catmandu, but they are very common and can be shared across tools. I think these would be candidates to formalise. Cheers Patrick

On 06/01/14 16:33, "Klee, Carsten" <Carsten.Klee@sbb.spk-berlin.de> wrote:
>
>On the other hand I could imagine something like "100[0]" for the first
>100 field (author) and "100[1]" for the second and so on. But what is
>about repeatable subfields? Maybe someone requires the first subfield "a"
>of the second 100 field. Besides the characters "[" and "]" are also
>valid subfield codes (see [2]).
>
>With substrings it is more complicated. I only could imagine using
>regular expressions. Maybe something like 245a[Å\s(.*)]_10. But for
>usability reasons this might be better left to the applications. Isn't
>there something in Catmandu like
>marc_map('245','my.title', -substring-after => 'Å '); ??
>
>Maybe you have another solution for that?
>
>Another issue I suspect with your last example under
>https://metacpan.org/pod/Catmandu::Fix::marc_map
>
># Copy all 100 subfields except the digits to the 'author' field
>marc_map('100^0123456789','author');
>
>In the current MARCspec this would be interpreted as "a reference to
>subfields ^, 0, 1, 2, 3, 4, 5, 6, 7, 8 and 9 of field 100". This is
>because "^" is a valid subfield code (see [2]).
>
>So far... I would be happy to read more comments on this.
>
>Cheers!
>
>Carsten
>
>
>[1] <https://github.com/cKlee/marc-spec/issues>
>[2] <http://www.loc.gov/marc/specifications/specrecstruc.html#varifields>
>_______________________________________________
>Carsten Klee
>Abt.
Überregionale Bibliographische Dienste IIE
>Staatsbibliothek zu Berlin - Preußischer Kulturbesitz
>
>Phone: +49 30 266-43 44 02
>
>> -----Original Message-----
>> From: Patrick Hochstenbach [mailto:Patrick.Hochstenbach@UGent.be]
>> Sent: Friday, 20 December 2013 14:06
>> To: voss@gbv.de; librecat-dev@mail.librecat.org; perl4lib@perl.org
>> Cc: Klee, Carsten
>> Subject: Re: [librecat-dev] A common MARC record path language
>>
>> Hi
>>
>> Thanks for this initiative to formalise the path language for MARC
>> records. In Catmandu our path language is better described at:
>> https://metacpan.org/pod/Catmandu::Fix::marc_map. It would be an easy fix
>> for us to follow Carsten's MARC spec rules and I will gladly implement it
>> for our community.
>>
>> We see these types of MARC paths in programming libraries such as the
>> projects mentioned below but also in products like XSLT, SolrMarc,
>> ILS-vendors who need them to define how to index marc, standardisation
>> bodies that provide mapping rules (e.g.
>> http://www.loc.gov/standards/mods/mods-mapping.html). I tried to make a
>> small roundup in the past of these projects but it would be great to have
>> a more extensive look at all current practices.
>>
>> In our Catmandu project we found that Xpaths are too verbose for our
>> librarians to interpret and in practice tied to XSLT-programming which
>> requires quite some programming skills to read and interpret.
>>
>> Our paths are very much simplified but still seem to lack some things that
>> are available in the MARC data model which would be great to have
>> available in the MARCspec syntax:
>>
>> - Notion of pointing to the first item (first author)
>> - Supporting locally defined MARC (sub)fields (e.g. Ex Libris exports
>> contain all kinds of Z30, CAT, etc. fields)
>> - Support for pointing to subfields that follow a specific character
>> (e.g. In titles I would like to point to everything after the Å/Å in a 245
>> field).
>>
>> Cheers and have a nice holiday
>>
>> Patrick
>>
>>
>> On 19/12/13 13:16, "Jakob Voß" <voss@gbv.de> wrote:
>>
>> >Hi,
>> >
>> >Carsten Klee specified a simple path language for MARC records, called
>> >"MARC spec". In short it is a formal syntax to refer to selected parts
>> >of a MARC record (similar to XPath for XML):
>> >
>> >http://collidoscope.de/lld/marcspec-as-string.html
>> >http://cklee.github.io/marc-spec/marc-spec.html#examples
>> >
>> >Similar languages have been invented before but not with a strict
>> >specification, as far as I know. For instance the perl Catmandu::MARC
>> >supports references to MARC fields:
>> >
>> >https://metacpan.org/pod/Catmandu::Fix::Inline::marc_map
>> >https://metacpan.org/source/NICS/Catmandu-MARC-0.103/lib/Catmandu/Fix/Inline/marc_map.pm#L26
>> >
>> >Could you please have a look at MARC spec and join forces to get a
>> >common syntax that can be used among different tools? So
>> >
>> >- If your tool does not support all aspects of MARC spec, please
>> >implement the missing parts.
>> >
>> >- If your tool supports more than included in MARC spec, help extending
>> >the syntax at https://github.com/cKlee/marc-spec/
>> >
>> >- If you tool uses a different syntax to refer to parts of MARC,
>> >please think about modifying it to align with MARC spec.
>> >
>> >Cheers,
>> >Jakob
>> >
>> >--
>> >Jakob Voß <jakob.voss@gbv.de>
>> >Verbundzentrale des GBV (VZG) / Common Library Network
>> >Platz der Goettinger Sieben 1, 37073 Göttingen, Germany
>> >+49 (0)551 39-10242, http://www.gbv.de/
>> >
>> >_______________________________________________
>> >librecat-dev mailing list
>> >librecat-dev@mail.librecat.org
>> >http://mail.librecat.org/mailman/listinfo/librecat-dev
>
https://www.nntp.perl.org/group/perl.perl4lib/2014/01/msg3072.html Tue, 21 Jan 2014 08:56:21 +0000
A common MARC record path language (3 messages) Hi, Carsten Klee specified a simple path language for MARC records, called "MARC spec".
In short it is a formal syntax to refer to selected parts of a MARC record (similar to XPath for XML): http://collidoscope.de/lld/marcspec-as-string.html http://cklee.github.io/marc-spec/marc-spec.html#examples Similar languages have been invented before but not with a strict specification, as far as I know. For instance the perl Catmandu::MARC supports references to MARC fields: https://metacpan.org/pod/Catmandu::Fix::Inline::marc_map https://metacpan.org/source/NICS/Catmandu-MARC-0.103/lib/Catmandu/Fix/Inline/marc_map.pm#L26 Could you please have a look at MARC spec and join forces to get a common syntax that can be used among different tools? So - If your tool does not support all aspects of MARC spec, please implement the missing parts. - If your tool supports more than included in MARC spec, help extending the syntax at https://github.com/cKlee/marc-spec/ - If you tool uses a different syntax to refer to parts of MARC, please think about modifying it to align with MARC spec. Cheers, Jakob -- Jakob Voß <jakob.voss@gbv.de> Verbundzentrale des GBV (VZG) / Common Library Network Platz der Goettinger Sieben 1, 37073 Göttingen, Germany +49 (0)551 39-10242, http://www.gbv.de/ https://www.nntp.perl.org/group/perl.perl4lib/2013/12/msg3069.html Thu, 19 Dec 2013 12:16:14 +0000 MARC::Record 2.0.6 (1 message) Hi, I have uploaded version 2.0.6 of MARC::Record to CPAN. This is a small functionality and bugfix release. Here are the changes since 2.0.5: [ENHANCEMENTS] - MARC::Field->as_string() now accepts an optional second parameter to specify the delimiter to use between subfields. (Tomas Cohen Arazi) - MARC::Field->delete_subfield() can now accept a regexp to specify the subfields to remove. 
For example, to remove all numeric subfields, one can say: $field->delete_subfield(code => qr/\d/); (Jason Stephenson) [FIXES] - the warnings pragma is now used throughout MARC::Record - $field->as_string('0') now returns the contents of subfield $0 rather than the contents of all of the subfields in the field. - RT#88421: add newline after printing warnings (Jason Stephenson) - RT#85804: fix spelling glitch (Gregor Herrmann) Regards, Galen -- Galen Charlton gmcharlt@gmail.com https://www.nntp.perl.org/group/perl.perl4lib/2013/10/msg3067.html Tue, 22 Oct 2013 16:53:00 +0000
Re: New perl module MARC::File::MiJ -- marc-in-json for perl (1 message) hello
> It's currently supported across several implementations:
> * ruby's marc gem
> * php's File_MARC
> * java's marc4j
> * python's pymarc
just for the record: i was aware of MIJ when i wrote MARC::MIR. I just ignored it because of the problems i just mentioned on code4lib. yet, a converter would be easy:

sub mir2mij (_) {
    my $mir = shift;
    { leader: $$mir[0]
    , fields:
        [ map {
            my ( $tag, $data ) = %$_;
            if ( ref $data ) { # data field
                [ $tag
                , [ map [%$_], @{$$data{subfields}} ]
                , [ @{$data}{qw< ind1 ind2 >} ] ]
            } else { # control field
                [ $tag, $data ]
            }
        } @{$$mir[1]} ]
    }
}

regards -- Marc Chantreux Université de Strasbourg, Direction Informatique 14 Rue René Descartes, 67084 STRASBOURG CEDEX ☎: 03.68.85.57.40 http://unistra.fr "Don't believe everything you read on the Internet" -- Abraham Lincoln https://www.nntp.perl.org/group/perl.perl4lib/2013/09/msg3066.html Tue, 24 Sep 2013 11:27:16 +0000
Perl module to transform XSL to JSON (2 messages) Greetings, could you please suggest a tool to transform an xsl file i have managed to get from XML into JSON? Thank you https://www.nntp.perl.org/group/perl.perl4lib/2013/09/msg3064.html Mon, 23 Sep 2013 08:06:25 +0000
MARC::Charset 1.35 (1 message) Hi, I have uploaded version 1.35 of MARC::Charset to CPAN.
This is a relatively significant bugfix release, particularly for folks who need to handle MARC-8 records containing extended Cyrillic and Arabic characters. Changes from 1.34 are:

- improve conversion of certain composed characters to MARC8

  Some characters should not be fully decomposed before converting them to MARC8. This patch adds a table of such characters, based on Annex A of http://www.loc.gov/marc/marbi/2006/2006-04.html and on some sample records provided by Jason Stephenson of MVLC.

- recognize G0 and G1 characters properly

  When converting from MARC8 to UTF8, MARC::Charset now properly recognizes if a (single-byte) MARC8 character falls in G0 or G1. This is part of the fix for RT#63271 (converting characters in the Extended Cyrillic character set), but should also fix similar issues with converting characters in the extended Arabic set. This commit also means that all MARC8 character sets that support both G0 and G1 will be properly converted, regardless of whether they're currently set as the G0 or G1 character set. For example, it is now possible to convert Extended Latin as G0 or Basic Latin as G1. This fixes RT#63271

- have MARC::Charset::Code->marc_value() handle G0/G1 conversion

  Since there's at present no need to do things like have ANSEL be the G0 character set when converting from UTF8 to MARC8, this commit centralizes the logic for deciding whether to return the G0 or G1 MARC8 representation of a character. Also add MARC::Charset::Code->g0_marc_value(), which returns the G0 representation of the character for use by the character DB.

- New test cases for converting Vietnamese and Extended Cyrillic text.
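For readers unfamiliar with the G0/G1 distinction in the entries above, here is a minimal sketch of the idea in plain Perl. It is not MARC::Charset's code: the function names marc8_graphic_set and g0_position are invented for illustration, and the only fact relied on is the usual ISO 2022 layout that MARC-8 follows, where G0 occupies bytes 0x21-0x7E and G1 occupies 0xA1-0xFE, so the two positions of one character differ by 0x80.

```perl
use strict;
use warnings;

# Hypothetical helper: which graphic set does a single MARC-8 byte fall in?
sub marc8_graphic_set {
    my ($byte) = @_;
    return 'G0' if $byte >= 0x21 && $byte <= 0x7E;
    return 'G1' if $byte >= 0xA1 && $byte <= 0xFE;
    return 'other';    # control characters, space, delete, ...
}

# Hypothetical helper: map a byte to its G0 position, leaving non-G1 bytes as-is.
sub g0_position {
    my ($byte) = @_;
    return marc8_graphic_set($byte) eq 'G1' ? $byte - 0x80 : $byte;
}

for my $byte (0x41, 0xC1, 0x1B) {
    printf "0x%02X -> %-5s (G0 position 0x%02X)\n",
        $byte, marc8_graphic_set($byte), g0_position($byte);
}
```

For real conversions the module's exported marc8_to_utf8() and utf8_to_marc8() functions are the thing to call; the sketch only illustrates why a converter has to know which half of the byte range a character arrived in.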
Regards, Galen -- Galen Charlton gmcharlt@gmail.com https://www.nntp.perl.org/group/perl.perl4lib/2013/08/msg3063.html Wed, 14 Aug 2013 03:06:54 +0000 Catmandu and MODS::Record (3 messages) Hi all LibreCat -=-=-=-= LibreCat is an open collaboration of the university libraries of Lund, Ghent, and Bielefeld to create tools for library and research services. One of the toolkits we provide is called 'Catmandu' (http://search.cpan.org/~nics/Catmandu-0.5004/lib/Catmandu.pm) which is a suite of tools to do ETL processing on library data. We provide tools to import data via JSON, YAML, CSV, MARC, SRU, OAI-PMH and more. To transform this data we created a small DSL language that librarians use in our institutions. Also we make it very easy to store the results in MongoDB, ElasticSearch, Solr or export it into various formats. We create also command line tools because we felt that in our daily jobs we were creating the same type of adhoc Perl scripts over and over for endless reports. E.g. to create a CSV file of all titles in a MARC export we say something like: $ catmandu convert MARC to CSV --fix 'marc_map("245","title"); retain_field("record");' < records.mrc To get all titles from our institutional repository we say: $ catmandu convert OAI --url http://biblio.ugent.be/oai to JSON --fix 'retain_field("title")' To store a MARC export into a MongoDB we do: $ catmandu import MARC to MongoDB --database_name mydb --bag data < records.mrc Here is a blog post about the commands that are available: http://librecat.org/catmandu/2013/06/21/catmandu-cheat-sheet.html See our project page for more information about LibreCat and Catmandu : http://librecat.org and a tutorial how to work with the API http://librecat.org/tutorial/ MODS::Record -=-=-=-=-=-= In one of our Catmandu projects we created a Perl connector for Fedora Commons (http://search.cpan.org/~hochsten/Catmandu-FedoraCommons-0.24). One of our goals was to integrate better with the Islandora project. 
For this we needed a Perl MODS parser. As there was no module available on CPAN we provide a top level module like MARC::Record called MODS::Record http://search.cpan.org/~hochsten/MODS-Record-0.05/lib/MODS/Record.pm. I hope this will be of some help for the community. If there are coders here who would like to contribute to the MODS package please drop me a line. I think CPAN MODS support shouldn't be dependent on one coder, one institution. Greetings from a sunny Belgium, Patrick https://www.nntp.perl.org/group/perl.perl4lib/2013/08/msg3060.html Tue, 06 Aug 2013 07:04:10 +0000 New perl module MARC::File::MiJ -- marc-in-json for perl (1 message) The marc-in-json<http://dilettantes.code4lib.org/blog/2010/09/a-proposal-to-serialize-marc-in-json/>format is, as you might expect, a JSON serialization for MARC. A JSON serialization for MARC is potentially useful in the same places where MARC-XML would be useful (long records, utility of human-readable records, etc.) without what many perceive to be the relative pain of working with XML vs JSON. It's currently supported across several implementations: - ruby's *marc* gem - php's *File_MARC* - java's *marc4j* - python's *pymarc* There wasn't one for perl, so I wrote one :-) MARC::File::MiJ<http://search.cpan.org/~gmcharlt/MARC-File-MiJ-0.01/lib/MARC/File/MiJ.pm>is a perl module that allows MARC::Record to encode/decode marc-in-json. It also supplies a handler to MARC::File/MARC::Batch that will read marc-in-json records from a newline-delimited-json (ndj) file (where each line is a JSON object without unescaped newlines, ending with a newline). marc-in-json encoding/decoding tends to be pretty fast<http://robotlibrarian.billdueber.com/sizespeed-of-various-marc-serializations-using-ruby-marc/>, since json parsers tend to be pretty fast, and uncompressed filesizes occupy a middle-ground between binary marc and marc-xml. 
A sample file of about 18k marc records looks like this: 31M topics.mrc 56M topics.ndj (newline-delimited JSON) 93M topics.xml 8.9M topics.mrc.gz 7.9M topics.ndj.gz 8.7M topics.xml.gz ...so obviously it compresses pretty well, too. I can take generic questions; bugs should go to https://rt.cpan.org/Public/Bug/Report.html?Queue=MARC-File-MiJ [ Note that there are many other possible JSON serializations for MARC<http://jakoblog.de/2011/04/13/mapping-bibliographic-record-subfields-to-json/>, including the (incompatible) one implemented in the MARC::File::JSON<http://search.cpan.org/~cfouts/MARC-File-JSON-0.002/lib/MARC/File/JSON.pm>module] -- Bill Dueber Library Systems Programmer University of Michigan Library https://www.nntp.perl.org/group/perl.perl4lib/2013/07/msg3059.html Mon, 15 Jul 2013 22:52:46 +0000
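To make the newline-delimited marc-in-json layout described in this announcement concrete, here is a small sketch that builds one record by hand and writes it as a single JSON line. It deliberately does not call MARC::File::MiJ, whose methods are not shown in the post; it uses only the core JSON::PP module, and the record contents are invented for illustration.

```perl
use strict;
use warnings;
use JSON::PP;

# One record in marc-in-json shape: a leader plus an ordered list of fields.
# Control fields map tag => string; data fields map tag => indicators + subfields.
my $record = {
    leader => '00000nam a2200000 a 4500',
    fields => [
        { '001' => 'ocm12345' },
        { '245' => {
            ind1      => '1',
            ind2      => '0',
            subfields => [ { a => 'A title /' }, { c => 'An author.' } ],
        } },
    ],
};

# canonical(1) sorts hash keys so the output is stable; the default encoder
# emits no newlines, so one record occupies exactly one line.
my $json = JSON::PP->new->canonical(1);
my $line = $json->encode($record) . "\n";
print $line;

# Reading a line back gives the same structure.
my $back = $json->decode($line);
print $back->{fields}[1]{'245'}{subfields}[0]{a}, "\n";
```

Fields and subfields are kept as arrays of single-key hashes because MARC allows repeated tags and repeated subfield codes, and their order matters.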