Zebra - User's Guide and Reference

Adam Dickmeiss

Heikki Levanto

Marc Cromme

Mike Taylor

Sebastian Hammer

2.0.38

Abstract

Zebra is a free, fast, friendly information management system. It can index records in XML, SGML, MARC, e-mail archives and many other formats, and quickly find them using a combination of boolean searching and relevance ranking. Search-and-retrieve applications can be written using APIs in a wide variety of languages, communicating with the Zebra server using industry-standard information-retrieval protocols or web services.

This manual explains how to build and install Zebra, configure it appropriately for your application, add data and set up a running information service. It describes version 2.0.38 of Zebra.


Table of Contents

1. Introduction
Overview
Zebra Features Overview
Zebra Document Model
Zebra Search Features
Zebra Index Scanning
Zebra Document Presentation
Zebra Sorting and Ranking
Zebra Live Updates
Zebra Networked Protocols
Zebra Data Size and Scalability
Zebra Supported Platforms
References and Zebra based Applications
Koha free open-source ILS
Kete Open Source Digital Library and Archiving software
Emilda open source ILS
ReIndex.Net web based ILS
DADS - the DTV Article Database Service
Infonet Eprints
Alvis
ULS (Union List of Serials)
NLI-Z39.50 - a Natural Language Interface for Libraries
Various web indexes
Support
2. Installation
UNIX
GNU/Debian
GNU/Debian Linux on i686 Platform
Ubuntu/Debian and GNU/Debian on other platforms
WIN32
Upgrading from Zebra version 1.3.x
3. Tutorial
A first OAI indexing example
Searching the OAI database by web service
Presenting search results in different formats
More interesting searches
Investigating the content of the indexes
Setting up a correct SRU web service
Searching the OAI database by Z39.50 protocol
4. Overview of Zebra Architecture
Local Representation
Main Components
Core Zebra Libraries Containing Common Functionality
Zebra Indexer
Zebra Searcher/Retriever
YAZ Server Frontend
Record Models and Filter Modules
Indexing and Retrieval Workflow
Retrieval of Zebra internal record data
5. Query Model
Query Model Overview
Query Languages
Operation types
RPN queries and semantics
RPN tree structure
Explain Attribute Set
BIB-1 Attribute Set
Zebra general Bib1 Non-Use Attributes (type 2-6)
Extended Zebra RPN Features
Zebra specific retrieval of all records
Zebra specific Search Extensions to all Attribute Sets
Zebra specific Scan Extensions to all Attribute Sets
Zebra special IDXPATH Attribute Set for GRS-1 indexing
Mapping from PQF atomic APT queries to Zebra internal register indexes
Zebra Regular Expressions in Truncation Attribute (type = 5)
Server Side CQL to PQF Query Translation
6. Administrating Zebra
Record Types
The Zebra Configuration File
Locating Records
Indexing with no Record IDs (Simple Indexing)
Indexing with File Record IDs
Indexing with General Record IDs
Register Location
Safe Updating - Using Shadow Registers
Description
How to Use Shadow Register Files
Relevance Ranking and Sorting of Result Sets
Overview
Static Ranking
Dynamic Ranking
Sorting
Extended Services: Remote Insert, Update and Delete
Extended services in the Z39.50 protocol
Extended services from yaz-client
Extended services from yaz-php
Extended services debugging guide
7. DOM XML Record Model and Filter Module
DOM Record Filter Architecture
DOM XML filter pipeline configuration
Input pipeline
Extract pipeline
Store pipeline
Retrieve pipeline
Canonical Indexing Format
DOM Record Model Configuration
DOM Indexing Configuration
DOM Indexing MARCXML
DOM Indexing Wizardry
Debugging DOM Filter Configurations
8. ALVIS XML Record Model and Filter Module
ALVIS Record Filter
ALVIS Internal Record Representation
ALVIS Canonical Indexing Format
ALVIS Record Model Configuration
ALVIS Indexing Configuration
ALVIS Exchange Formats
ALVIS Filter OAI Indexing Example
9. GRS-1 Record Model and Filter Modules
GRS-1 Record Filters
GRS-1 Canonical Input Format
GRS-1 REGX And TCL Input Filters
GRS-1 Internal Record Representation
Tagged Elements
Variants
Data Elements
GRS-1 Record Model Configuration
The Abstract Syntax
The Configuration Files
The Abstract Syntax (.abs) Files
The Attribute Set (.att) Files
The Tag Set (.tag) Files
The Variant Set (.var) Files
The Element Set (.est) Files
The Schema Mapping (.map) Files
The MARC (ISO2709) Representation (.mar) Files
GRS-1 Exchange Formats
Extended indexing of MARC records
The index-formula
Notation of index-formula for Zebra
10. Field Structure and Character Sets
The default.idx file
Charmap Files
ICU Chain Files
I. Reference
zebraidx — Zebra Administrative Tool
zebrasrv — Zebra Server
idzebra-config — Script to get information about idzebra
A. License
GNU General Public License
B. About Index Data and the Zebra Server

List of Figures

7.1. DOM XML filter architecture