
MWeb™ Universal Administrator's Guide


2. Configuring Databases

Database Types

The concept of a Database Type is central to MWeb Universal. At present MWeb Universal offers (or will offer) these Database Types. They are color-coded in this documentation to help you find relevant information quickly.

  • MARC Databases
  • MWeb Enterprise Databases
  • PastPerfect Databases on the PP Server (running on the PastPerfect server)
  • Self-Hosted PastPerfect Databases (running on your server)
  • Relational Databases

MARC Databases (in release 1.2 and later)

What MWeb Universal calls a MARC Database is not a true database, but simply one or more MARC files (MARC is the international standard bibliographic format used by libraries). Release 1.2 supports MARC21 bibliographic and authority formats, in MARC-8 or UTF-8. If the MARC Database consists of more than one file, all files must be in the same format (bibliographic or authority), and in the same encoding (MARC-8 or UTF-8).

Later releases will allow multiple formats and encodings to be searched simultaneously. In addition, we plan to extend support to MARCXML and UNIMARC formats.

There is very little configuration for you to do for MARC files, since MWeb has wizards to do this. See More About MARC Databases below for more information.

At present, MARC files must adhere to the standard strictly. For example, carriage returns and line-feeds are not allowed between records.
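A quick pre-check can catch this before you add the files. The sketch below is not part of MWeb; it relies only on the fact that every MARC record ends with the record terminator byte 0x1D, and it reports any carriage return or line-feed found where the next record should begin. The function name is hypothetical, and the scan does not validate the Leader or the record length.

```python
RECORD_TERMINATOR = 0x1D  # end-of-record byte in MARC 21 (ISO 2709)


def find_stray_line_breaks(data: bytes) -> list[int]:
    """Return byte offsets of CR/LF bytes that sit between records."""
    offsets = []
    pos = 0
    while pos < len(data):
        # Report (and skip) any CR/LF found where a record should start.
        while pos < len(data) and data[pos] in (0x0D, 0x0A):
            offsets.append(pos)
            pos += 1
        if pos >= len(data):
            break
        # Jump to just past the next record terminator.
        end = data.find(bytes([RECORD_TERMINATOR]), pos)
        if end == -1:
            break  # trailing bytes without a terminator; not examined here
        pos = end + 1
    return offsets


# Usage (hypothetical path):
#   with open(r"d:\library\catalog.mrc", "rb") as f:
#       print(find_stray_line_breaks(f.read()))
```

An empty list means no line breaks were found between records; it does not prove the file is otherwise valid.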

MWeb Enterprise Databases (in future release 2.0)

The MWeb Universal setup wizard will configure a MWeb Enterprise Database using the configuration in MWeb Enterprise. The automatic configuration transfers from Enterprise to Universal the following: what fields are indexed, the assignment of fields to Search Categories, and the display-names of fields. Some configuration options can be changed in MWeb Universal after the wizard runs.

PastPerfect Databases (in release 1.1 and later)

Our PastPerfect-Online product is the easiest way to provide online access to your PastPerfect data. This solution includes hosting provided by PastPerfect Software, Inc. However, if you wish to host your online database yourself, or to provide integrated searching of your database with other databases, you need MWeb Universal.

PastPerfect clients have three ways to use MWeb Universal:

  1. Federated searching of multiple PastPerfect Databases on the PP Server uses the databases on the PastPerfect server. For this you would order -- for each database -- the PastPerfect-Online module with hosting from PastPerfect Software, Inc. Then order MWeb Universal from Systems Planning to provide simultaneous access to that database and your others. Your online catalogs on the PastPerfect server will continue to function normally.
  2. A Self-Hosted PastPerfect Database uses the add-on module for the PastPerfect collections management system; this module includes a feature to export records. This exports them into files that can be used with MWeb Universal without using the PastPerfect hosting service. The add-on module also exports your desired configuration for the MWeb Universal Setup Wizard. For this solution, buy the PastPerfect-Online module without hosting from PastPerfect Software, Inc. and MWeb Universal from Systems Planning.
  3. MWeb Universal 2.0 will provide access directly to the database in the PastPerfect collections management system. This is a Relational Database (see below), so it is much harder than options 1 or 2 because you will have to learn the database structure and then do all the mapping yourself. It is not recommended unless the Project must provide access to real-time data instead of periodic exports.

Relational Databases (in future release 2.0)

A Relational Database is called that because the various data tables relate to each other. In other words, to retrieve a complete record, data from several tables might be required. A typical example would be to retrieve artwork information from one table, and the artist information from another. MWeb cannot understand the relationships without your help. If you have a Database like this, you will have to write SQL statements to explain to MWeb how to construct a complete record. This is covered in detail under Advanced Relational Topics below.
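To make the artwork and artist example concrete, here is a minimal sketch of the kind of SQL involved, run against an in-memory SQLite database from Python. The table names, column names, and sample rows are all hypothetical, and the exact form MWeb expects for its own queries is described under Advanced Relational Topics; nothing here should be read as MWeb syntax.

```python
import sqlite3

# Build a tiny two-table database (hypothetical structure and data).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE artists  (artist_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE artworks (artwork_id INTEGER PRIMARY KEY, title TEXT,
                           artist_id INTEGER REFERENCES artists(artist_id));
    INSERT INTO artists  VALUES (1, 'Example Artist');
    INSERT INTO artworks VALUES (10, 'Example Title', 1);
""")

# A complete record needs columns from both tables, so the query joins them.
FULL_RECORD_SQL = """
    SELECT w.artwork_id, w.title, a.name
    FROM artworks AS w
    JOIN artists  AS a ON a.artist_id = w.artist_id
    WHERE w.artwork_id = ?
"""


def full_record(artwork_id):
    """Return (artwork_id, title, artist name), or None if not found."""
    return conn.execute(FULL_RECORD_SQL, (artwork_id,)).fetchone()
```

Without the join, a search could show only the columns of the artworks table, including the bare artist_id rather than the artist's name.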

MWeb will function completely without these SQL statements, so you can write them later if you wish. In the meantime, Search Results and Full Records will show fields from a single table at a time, which may be adequate for your needs.

In addition, tables with images require SQL to be written to find the image information.

Adding Databases

To tell MWeb Universal about the databases to search, follow these instructions for each database. Don't worry too much, as all the settings can be changed at any time.

These instructions assume you have already installed your Database and the MWeb Universal Database Connector onto the Database Server. If not, please do so using the instructions in Appendix 1: Installation, then return here.

MWeb automatically generates your Project's Subsets from those in the first Database added. Therefore the first Database added should be the most comprehensive in content. This is easier than modifying Subsets later.

  1. Ensure that no one is performing maintenance on the Database. It must be available through ODBC, which means it must not be opened exclusively by some other user. Unless MWeb can access the Database it will make wrong decisions.
  2. Click the Logon button in the MWeb Main Menu and enter your Administrator's user ID and password.
  3. Click the Databases button, select the Database Type (see box below) to add, then click the Add New Database button. You will see the Add Database form.

Database Type

Selecting the Database Type activates the appropriate wizard. Wizards save you the trouble of telling MWeb the structure of your Database. We add a wizard whenever we learn the structure of a widely used Database. Here are the wizards currently available:

  • MARC Database -- A Database consisting of MARC21 files.
  • Self-Hosted PastPerfect Database -- The files exported from the PastPerfect-Online add-on module in the PastPerfect collections management system. These files let you run PastPerfect-Online on your own server.
  • PastPerfect Database on the PP Server -- A PastPerfect-Online Database hosted on the PastPerfect server.

Release 2.0 will include a wizard for

  • MWeb Enterprise Database -- An existing MWeb Enterprise Database.

We will update this list as we develop more wizards.


The "Add Database" form

When the Database is a MARC file

What MWeb Universal calls a MARC Database is just one or more MARC21 bibliographic or authority files in the same folder; MARCXML and UNIMARC will be added in future releases.

Code, Name, Domain, and CGI Path are the same as described below under "Fill in the requested information".

Data-Source Name is a full path on the Database Server pointing to the MARC file. The name of the MARC file may include wild-card characters. Examples:

d:\library\catalog.mrc
d:\library\snapshot*.mrc
d:\library\*.*

If there are wildcards in the Data-Source Name, we call that a "file-set", otherwise a "file".
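The distinction can be sketched in a few lines. This is an illustration only: it treats * as the wild-card character (the only one shown in the examples above), and it approximates matching with Python's fnmatch module, ignoring case as Windows file systems do. fnmatch also gives special meaning to ? and [ ], which MWeb may not, and MWeb's actual matching rules may differ in other ways. Both function names are hypothetical.

```python
import fnmatch
import ntpath  # Windows-style path handling, usable on any platform


def classify_data_source(dsn: str) -> str:
    """Return "file-set" if the file name contains a wildcard, else "file"."""
    return "file-set" if "*" in ntpath.basename(dsn) else "file"


def matching_files(dsn: str, filenames: list[str]) -> list[str]:
    """Return the names from `filenames` that the Data-Source Name would match."""
    pattern = ntpath.basename(dsn).lower()
    return [f for f in filenames if fnmatch.fnmatchcase(f.lower(), pattern)]


# Usage:
#   classify_data_source(r"d:\library\snapshot*.mrc")
#   matching_files(r"d:\library\snapshot*.mrc", ["snapshot1.mrc", "catalog.mrc"])
```

Note that a pattern such as *.* will pick up every file in the folder that has an extension, including any that are not MARC files, so a narrower pattern is usually safer.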

The wizard requires all files to be in the same folder, in the same format (bibliographic or authority), and in the same encoding (MARC-8 or UTF-8). However, after the wizard has run you may add files in other folders to the Database, as long as they have the same format and encoding.

Release 2.0 of MWeb Universal will allow multiple formats and encodings to be searched simultaneously. In addition, we plan to extend support to MARCXML and UNIMARC formats.

Fill in the requested information:

  • Code is a brief name the users will understand when they see it in the Search Results. It need not be the same as the Database Code you used during installation.

    The Search Results are sorted alphabetically by these Database Codes.

  • Name is a longer name for the Database for your reference.
  • Domain is the domain of the Database Server, such as "www.myhost.org" (do not add "http://"). Do not use "localhost" or "127.0.0.1" as the domain; if you do, your Database will not be seen from outside.
  • CGI Path is the virtual directory to the CGI Directory containing the Database Connector (referred to as "udccgi" in the installation instructions). Do not use slashes here.
  • Data-Source Name is the ODBC Data-Source Name for the Database on the Database Server.
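The rules for Domain and CGI Path are easy to get wrong, so you may want to check your values before typing them into the form. The following is a minimal sketch, not an MWeb tool; the function name is hypothetical, and treating a backslash as a "slash" is our assumption.

```python
def check_connection_fields(domain: str, cgi_path: str) -> list[str]:
    """Return a list of problems found in the Domain and CGI Path values."""
    problems = []
    d = domain.strip().lower()
    if "://" in d:
        problems.append('Domain must not include a scheme such as "http://"')
    if d in ("localhost", "127.0.0.1"):
        problems.append("Domain must be visible from outside, not localhost")
    # Assumption: backslashes are rejected along with forward slashes.
    if "/" in cgi_path or "\\" in cgi_path:
        problems.append("CGI Path must not contain slashes")
    return problems


# Usage:
#   check_connection_fields("www.myhost.org", "udccgi")   # no problems
#   check_connection_fields("localhost", "/udccgi/")      # two problems
```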

When all fields are filled in, click the Add button to add the Database.

Next Steps

If there is a wizard for your Database Type, MWeb will configure the Database. If there is no wizard, or if the wizard cannot do the complete configuration, MWeb will guide you through the steps below.


Configuring Features

MARC Databases -- read this section if your character encoding is not MARC-8
MWeb Enterprise Databases -- handled by wizard
Self-Hosted PastPerfect Databases -- handled by wizard
PastPerfect Databases on the PP Server -- handled by wizard
Relational Databases -- read this section

When you "configure" a Database, whether for the first time or later, you will proceed through a series of steps. To move to the next step, use the buttons at the bottom of the display. There is also a quick menu in the upper right of each display to return to steps you have completed.

The Configure Features display shows certain global characteristics of your Database. Click the Edit button to make any required changes. Click the Next Step button when finished.

Here are the data elements you will see in this configuration step:

ID The unique ID of this Database (system generated)
Database Type The Database Type is one of:
    MARC21AUTH        MARC Database (one or more MARC21 authority files with the same encoding)
    MARC21BIB         MARC Database (one or more MARC21 bibliographic files with the same encoding)
    MWEBENTERPRISE*   MWeb Enterprise Database
    PPOEXPORT         Self-Hosted PastPerfect Database
    PPOHOSTED         PastPerfect Database on the PP Server
    RELATIONAL*       Relational Database
* indicates a type that will be supported in Release 2.0. All others are currently supported.
This field stores the Database Type you chose when you added the Database. To change it, click the Edit button and select the correct Database Type from the dropdown list.

Unfortunately, changing this field will not run the configuration wizard for the new Database Type; instead, you may want to delete the Database and re-add it with the new Database Type.
Encoding This field records how special ("non-ASCII") characters are encoded:
    LATIN-1   Latin-1 encoding
    MARC-8    The library MARC-8 encoding
    NCR       Unicode encoded as decimal Numerical Character References (such as &#123;)
    UTF-8     Unicode encoded as UTF-8
If MWeb guessed wrong, click the Edit button and correct it. Use the punctuation shown here.
Data-Source Name This is the ODBC Data-Source Name for the Database that was set up during installation.
Index DBMS* You have the option of using the MySQL database system to store MWeb's index (rather than the default SQLite database system). This may provide faster indexing and retrieval for large Databases. If you wish to use MySQL, install it and create a MySQL database using the instructions in Express Setup. Then into this field enter "MySQL" (without quotation marks). Leave this field blank to use the default SQLite. You will have to reindex your MWeb Database after changing this field.
Index Name If you entered "MySQL" into the Index DBMS field, enter into this field the name of the MySQL database you created. You will have to reindex your MWeb Database after changing this field.
Index Key Length If you entered "MySQL" into the Index DBMS field, enter into this field the length of the keys to be used in the MySQL index. The smaller the key length, the faster MWeb will run. Make the default of 40 smaller if your data will allow it, larger if necessary. The key length should be the length of the unique identifiers in your Database, plus 4. You will have to reindex your MWeb Database after changing this field.
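The rule for Index Key Length is simple arithmetic: take the longest unique identifier in your Database and add 4. A minimal sketch, assuming you can export a list of identifiers and that length is counted in characters (the function name is hypothetical):

```python
def index_key_length(identifiers) -> int:
    """Longest identifier length plus 4, following the rule stated above."""
    return max(len(str(i)) for i in identifiers) + 4


# Usage: if the longest identifier is "2009.15.3" (9 characters),
# the suggested Index Key Length is 9 + 4 = 13.
#   index_key_length(["A1", "2009.15.3", "X-100"])
```

If the result is below the default of 40, you can lower the field to that value; if it is above 40, you need to raise it.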

* We have found no significant difference in speed of indexing or retrieval between MySQL and SQLite. Our largest test so far has been 1.6 million records (80 million keywords, 6 GB filesize). These indexes took about 11 hours to build.

Configuring Tables

Initial setup

MARC Databases -- handled by wizard
MWeb Enterprise Databases -- handled by wizard
Self-Hosted PastPerfect Databases -- handled by wizard
PastPerfect Databases on the PP Server -- handled by wizard
Relational Databases -- read this section

This display shows the tables or files the setup wizard found in the Database.

If your Relational Database is in FoxPro, see The wizard did not add tables.

Review the Use This Table column. This controls both searching and display. If it is checked, the table may be indexed and its data may be found by keyword and phrase searches. If it is not checked, the table will not be indexed, its values cannot be used in searches, and it will not be displayed. Generally you should checkmark all tables with significant content, excluding tables with obscure data or codes that no one would think of using in a search.

If the table has Images, click the Edit button for the table and fill in the three fields relating to images. Instructions are below, under Databases with Images.

Make sure the column Primary Key has the correct field or fields that comprise the primary key for the table. If more than one field is used to make the primary key, use a plus sign between the fields, like this: issueid+commentid.

Check the Numeric IDs column to see if MWeb has guessed correctly. If the Primary Key is made from more than one field, and the fields are a mix of numeric and non-numeric, then the Numeric IDs value should have one Y or N per field, joined by plus signs in the same order as the Primary Key, as shown here:

Primary Key                   Type of keys                           Numeric IDs
zipcode+areacode              Both numeric                           Y or Y+Y
lastname+firstname            Both non-numeric                       N or N+N
zipcode+name                  First numeric, second non-numeric      Y+N
city+zipcode                  First non-numeric, second numeric      N+Y
lastname+firstname+zipcode    First two non-numeric, third numeric   N+N+Y
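The table follows a mechanical rule: one Y or N per key field, in the same order, joined with plus signs. A minimal sketch of that rule, assuming you already know which of your fields are numeric (the function name and the sample field names are illustrative only):

```python
def numeric_ids(primary_key: str, numeric_fields: set[str]) -> str:
    """Build the Numeric IDs value that corresponds to a Primary Key value."""
    flags = ["Y" if name.strip() in numeric_fields else "N"
             for name in primary_key.split("+")]
    return "+".join(flags)


# Usage:
#   numeric_ids("zipcode+name", {"zipcode", "areacode"})
#   numeric_ids("lastname+firstname+zipcode", {"zipcode", "areacode"})
```

This always produces the long form (for example Y+Y rather than Y), which the table above lists as equivalent when all fields are of the same kind.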

For Relational Databases, certain records may require fields to be gathered from more than one database table. In this case you will need to fill in the Full Record Query column. This is discussed in detail below under Advanced Relational Topics.

Adding a table or file

To add a new table or file, click the button above the list of tables. You will be asked to provide this information:

For MARC Databases you may add files in any location, with any extension. For example, if your original Database was c:\marc\*.mrc, you may add a file c:\newfiles\*.xxx. However, the new files must have the same format (bibliographic or authority) and the same encoding (MARC-8 or UTF-8).

ID The unique ID of this table in MWeb (system generated)
Table Name The internal name of the table in the Database. For MARC files, the complete path, filename, and extension.
Display Name The name of the table as displayed. Search Results are sorted by this Display Name.
Image Query See Databases with Images below
Thumbnail Pattern See Databases with Images below
Image Pattern See Databases with Images below
Use This Table If checked, indicates that you want this table to be used in the Project. If unchecked, overrides all other codes. Therefore you can be sure that if this is unchecked, this table will never be indexed or displayed in the Project.

For control of specific fields, there is a Use This Field checkbox you will see in a later step. Both the table and the field checkboxes must be checked in order for a field to be indexed and displayed.

Editing a table or file

MWeb Enterprise Databases and PastPerfect Databases on the PP Server will not permit editing, since this would also affect the other users of that database. Instead, make configuration changes to the original database.

Click on the Edit button to change information about a table or file. More information is shown for editing than was shown for adding a table:

ID The unique ID of this table in MWeb (system generated)
Table Name The brief name of the table
Display Name The name of the table as displayed
Primary Key The unique identifier of the records in this table in your Database

For MARC Databases and MWeb Enterprise Databases leave this field empty.
Numeric IDs Whether the primary keys of the records in this table are numeric

For MARC Databases and MWeb Enterprise Databases leave this field empty.
Full Record Query For MARC Databases, this field may contain a comma-separated list of tags that overrides the values of Display in Full Record in the Fields list (see Configuring Fields below); this allows you to display different fields for different MARC files.

For MWeb Enterprise Databases leave this field empty.

For Relational Databases, this field holds the SQL (programming code) used to display the Full Records for this Subset (see Advanced Relational Topics below).
Image Query See Databases with Images below
Thumbnail Pattern See Databases with Images below
Image Pattern See Databases with Images below
Use This Table If checked, indicates that you want this table to be used in the Project. If unchecked, overrides all other codes. Therefore you can be sure that if this is unchecked, this table will never be indexed or displayed in the Project.

For control of specific fields, there is a Use This Field checkbox you will see in a later step. Both the table and the field checkboxes must be checked in order for a field to be indexed and displayed.

Configuring Subsets

MARC Databases -- no subsets are set up by wizard
MWeb Enterprise Databases -- initial configuration handled by wizard
Self-Hosted PastPerfect Databases -- initial configuration handled by wizard
PastPerfect Databases on the PP Server -- initial configuration handled by wizard
Relational Databases -- read this section

This display shows the Subsets for the Database. A Subset is a distinct type of content in your Database, such as museum objects, library records, artists, places, images, or media. For MARC Databases Subsets can be types of records, such as juvenilia, local history, rare books, or reference. Click the Edit button to change the Subset mappings.

A record can belong to only one Subset, so try to map Subsets to minimize overlapping definitions.

MWeb Enterprise Databases and PastPerfect Databases on the PP Server will not permit editing, since this would also affect the other users of that database. Instead, make configuration changes to the original database.

Here are the data elements you will see in this configuration step:

ID The unique ID of this Subset (system generated)
Subset Brief name of the Subset. This is displayed to the searcher in Keyword Search and Advanced Search displays.
Map Name Not currently used.
Mapping For MARC Databases, a SQL-like "where" clause (see below).

For MWeb Enterprise Databases and PastPerfect Databases on the PP Server, this is the Subset Number used in those systems.

For Relational Databases, the Tablename. It must be identical to the Tablename column in the previous step (Configure Tables).

Details about mapping MARC Databases

When mapping, remember that a MARC Database record can be in only one Subset. If you specify overlapping or duplicate Subset mappings, MWeb will use the first one found during indexing.

To map Subsets in a MARC Database, provide a SQL "where" clause in the Subset mapping field. For example, 260$c like 'New York'. This release of MWeb has the following syntax. We will add more flexibility in later releases (please let us know your needs!).

  • Only a single term (no ANDs or ORs).
  • Only a single subfield code (or none). Prefix the subfield code with a dollar sign.
  • Use x to indicate a wild-card digit in a tag (6xx like 'United States')
  • Operators are =, <>, like, >, <, >=, and <=.
  • Use single quotation marks around values to be searched, unless they are numbers. (Compare 260$a='New York' with LEN>3000).
  • You may use the following pseudo-tags to specify Leader bytes (or you may use the byte number or range): LDR (Leader), LEN (Record Length), STA (Record Status), TYP (Type of Record), LEV (Bibliographic Level), HIE (Hierarchical Level in UNIMARC records (future)), CTL (Type of Control), CHR (Character Coding Scheme), ENC (Encoding Level), CAT (Descriptive Cataloging Form), REL (Linked Record Requirement).
  • Use a slash to introduce a byte or byte-range of the Leader or Fixed Fields (LDR/6 or 008/0-5); for example 008/0-5=861231. Remember that these start with byte 0.
  • Use * to indicate the presence or absence of a field or subfield (6xx=* or 245<>*). Do not use any operator other than = or <> with *. Single quotation marks around the * are allowed but not needed.
  • To include all records in a Subset, use 008=* as the mapping (since every MARC21 record has a 008).
  • Tags, subfield codes, and values are all case-sensitive, except that if the operator is "like", the matching is NOT case-sensitive (in addition to the normal meaning of "like"). See next item for example.
  • The percent sign or other truncation symbol is not used with the "like" operator. MWeb searches as if your term had a % at both ends. For example, 1xx like 'john' will find records with 100 fields like "$aJones, John D.$d1955-" or "$aJones, J.D.$d1955-$q(John Dewey)".
  • Do not use ? or _ to indicate a single wild-card character.
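To make the syntax rules above easier to follow, here is a small sketch that splits a mapping into its parts and imitates two of the matching rules (the x wild-card in tags and the "like" operator). It is our own illustration, written from the rules listed above; it is not MWeb's parser, all names in it are hypothetical, and it does not evaluate a clause against a real MARC record.

```python
import re
from typing import NamedTuple, Optional

_CLAUSE = re.compile(r"""
    ^\s*
    (?P<tag>[0-9A-Za-z]{3})                  # tag, tag with x wild-cards, or pseudo-tag
    (?:\$(?P<subfield>\S))?                  # optional subfield code after a dollar sign
    (?:/(?P<start>\d+)(?:-(?P<end>\d+))?)?   # optional byte or byte range after a slash
    \s*
    (?P<op><>|>=|<=|=|>|<|like)              # one of the supported operators
    \s*
    (?P<value>.+?)                           # value, optionally in single quotes
    \s*$
""", re.VERBOSE)


class Clause(NamedTuple):
    tag: str
    subfield: Optional[str]
    start: Optional[int]
    end: Optional[int]
    op: str
    value: str


def parse_clause(text: str) -> Clause:
    """Split a single-term mapping such as 260$c like 'New York' into parts."""
    m = _CLAUSE.match(text)
    if m is None:
        raise ValueError(f"not a single-term mapping: {text!r}")
    value = m["value"]
    if len(value) >= 2 and value[0] == value[-1] == "'":
        value = value[1:-1]  # drop the surrounding single quotation marks
    start = int(m["start"]) if m["start"] is not None else None
    end = int(m["end"]) if m["end"] is not None else start
    return Clause(m["tag"], m["subfield"], start, end, m["op"], value)


def tag_matches(pattern: str, tag: str) -> bool:
    """An x in the pattern stands for any single digit (6xx matches 650)."""
    return len(pattern) == len(tag) and all(
        (p == "x" and t.isdigit()) or p == t for p, t in zip(pattern, tag))


def like(field_value: str, term: str) -> bool:
    """The "like" rule: case-insensitive, as if the term had % at both ends."""
    return term.lower() in field_value.lower()
```

For example, parse_clause("008/0-5=861231") yields the tag 008, the byte range 0 to 5, the operator =, and the value 861231, and like("Jones, John D.", "john") is true because the match ignores case and position.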

Complex Subset mapping in MARC Databases will slow down indexing. However, retrieval speed will not be affected.

Details about mapping Relational Databases

When mapping, remember that a Relational Database table can be in only one Subset. If you specify overlapping or duplicate Subset mappings, MWeb will use the first one found during indexing.

To map Subsets in a Relational Database, simply type into the Subset mapping field a list of table names in the Database. Table names may be separated by spaces, commas, or both. Use the exact name shown in the Database's list of tables. Use all uppercase letters.
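As an illustration of how such a list breaks down, the sketch below splits a mapping on spaces and commas and flags any name that is not all uppercase. It is a convenience check under our own assumptions, not MWeb code; the function names and table names are hypothetical.

```python
import re


def parse_table_list(mapping: str) -> list[str]:
    """Split a Subset mapping on spaces, commas, or both."""
    return [name for name in re.split(r"[,\s]+", mapping.strip()) if name]


def not_uppercase(names: list[str]) -> list[str]:
    """Return the names that are not written in all uppercase letters."""
    return [name for name in names if name != name.upper()]


# Usage:
#   names = parse_table_list("ARTWORKS, ARTISTS  Places,MEDIA")
#   not_uppercase(names)   # names that still need correcting
```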

Configuring Fields

MARC Databases -- read this section
MWeb Enterprise Databases -- handled by wizard
Self-Hosted PastPerfect Databases -- handled by wizard
PastPerfect Databases on the PP Server -- handled by wizard
Relational Databases -- read this section

This display shows the fields MWeb found in the Database.

MARC Databases are configured with the full set of official MARC tags. However, local-use tags are not discovered until the first time the Database is indexed. Every local-use field is indexed; their Display Names are set to "Tag xxx" and their Search Categories are set to "Remarks". After the first indexing, you may wish to suppress indexing of some local-use tags, or change their Display Names, or change their Search Categories from "Remarks". Your changes will take effect the next time you index the Database.

Display Names field

Names in the Display Name column are used in Search Results and Full Record displays. These were generated automatically, so you may wish to improve them; if so, click the Edit button for that row.

"Use This Field" checkbox

The Use This Field column controls both searching and display. If it is not checked, the field will be totally excluded from your Project. It will not be indexed and will not be displayed anywhere. In addition, the field will not be used if it is displayed in a gray font; this means this field's data table is unchecked in the Configure Tables display.

For a field to be indexed, its table's Use This Table checkbox must be checked, and the field must have its Use This Field checkbox checked, plus it must have a non-zero value for its Search Category. Generally you should have MWeb index all fields with significant content, excluding fields with obscure data or codes that no one would think of using in a search.

For a field to be displayed, its table's Use This Table checkbox must be checked, and the field must have its Use This Field checkbox checked, plus it must have a non-zero value in one of its three "Sequence" columns described below.
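The two rules above can be summarized as a pair of conditions. The sketch below models them with a small data class; the class and attribute names are hypothetical and stand in for the checkboxes and columns on the Configure Tables and Configure Fields displays.

```python
from dataclasses import dataclass


@dataclass
class FieldConfig:
    use_table: bool            # the table's Use This Table checkbox
    use_field: bool            # the field's Use This Field checkbox
    search_category: int = 0   # 0 means no Search Category
    data_seq: int = 0          # Data Display Sequence
    thumb_seq: int = 0         # Thumbnail Display Sequence
    full_seq: int = 0          # Full Record Sequence (or Display in Full Record)


def is_indexed(f: FieldConfig) -> bool:
    """Both checkboxes checked, plus a non-zero Search Category."""
    return f.use_table and f.use_field and f.search_category != 0


def is_displayed(f: FieldConfig) -> bool:
    """Both checkboxes checked, plus a non-zero value in any Sequence column."""
    return f.use_table and f.use_field and any(
        (f.data_seq, f.thumb_seq, f.full_seq))
```

A field can therefore be indexed without being displayed, displayed without being indexed, both, or neither, depending on its Search Category and Sequence values.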

Search Categories field

This column shows what Search Category the field belongs to. During the initial configuration, the Search Categories Wizard maps Search Categories based on the fieldnames. Since this is based only on words, it may not be very accurate, so you should review it.

Furthermore, if the Wizard cannot map a fieldname, it assigns the field to the "Remarks" Search Category. This ensures that every field will be indexed, but many fields may end up being indexed as Remarks incorrectly.

To change a field's Search Category, click the Edit button for the row and select a Search Category from the dropdown list. Select the none option if you do not want a field to be indexed (this will set its Search Category to 0).

You may wish to just print the list of fields if you have not yet worked out the Search Categories. After you have decided on Search Categories you can use the Search Categories Wizard to map them to all Databases in one operation. Then you can return to this display to make any corrections needed. See Project Setup for more on Search Categories.

Sequence fields

The three "Sequence" columns show the order that fields are displayed. The Data Display Sequence field controls the Search Results data layout, the Thumbnail Display Sequence field controls the Search Results thumbnail layout, and the Full Record Sequence* field controls the Full Record layout. Zero means the field is not shown in that display. If no fields have "Sequence" numbers, your displays will be empty. We recommend showing up to six fields in Data Displays, two or three in Thumbnail Displays, and all fields or most fields in Full Records.

* For MARC Databases, the Full Record Sequence column is named Display in Full Record, because the fields are always displayed in the order they are in the record; use 1 in Display in Full Record to display the field, 0 if the field should not be shown.

For MARC Databases, for Data Display Sequence and Thumbnail Display Sequence, the first three subfields are shown when you request a field to be displayed.

The numbers in the three "Sequence" columns need not be continuous -- you may skip numbers if you wish. Do not use negative numbers. Use zero if a field is not to be displayed.
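The effect of the Sequence numbers can be sketched as a simple sort: fields with zero are dropped, and the rest are shown in ascending order, gaps and all. This is an illustration of the rule as described, not MWeb code; the function name and field names are hypothetical, and breaking ties by field name is our assumption.

```python
def display_order(sequences: dict[str, int]) -> list[str]:
    """Return field names in display order, skipping fields set to zero."""
    shown = [(seq, name) for name, seq in sequences.items() if seq > 0]
    return [name for seq, name in sorted(shown)]


# Usage: numbers need not be continuous, so leaving gaps (5, 10, 20)
# makes it easy to slot in another field later.
#   display_order({"title": 10, "artist": 5, "notes": 0, "date": 20})
```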

When finished

If this is your initial configuration for this Database, click the Search Categories Wizard button when finished; however, this will overwrite any mapping of Search Categories you may have done by hand. To preserve the existing mapping, click the Apply Changes button.

Here are the data elements you will see in this configuration step:

ID The unique ID of this field in MWeb (system generated)
Table Name The brief name of the table this field belongs to
Field Name The brief name of this field
Display Name The name of the field as displayed
Search Category The MWeb Search Category this field belongs to (see below).
Data Display Sequence The sequence in which this field displays in Search Results data displays. Use a zero if the field is not to be shown in the Search Results data display.

For MWeb Enterprise Databases leave this field empty.
Thumbnail Display Sequence The sequence in which this field displays in Search Results thumbnail displays. Use a zero if the field is not to be shown in the Search Results thumbnail display.

For MWeb Enterprise Databases leave this field empty.
Full Record Sequence

or

Display in Full Record
The sequence in which this field displays in Full Records (used when there is no Full Record Query here). Use a zero if the field is not to be shown in the Full Record.

For MARC Databases, use a 1 if the field is to be displayed, zero if not. Fields display in the order they appear in the record.

For MWeb Enterprise Databases leave this field empty.
Use This Field If checked, indicates that you want this field to be used in the Project. If unchecked, this overrides all other codes. Therefore you can be sure that if this checkbox is unchecked, the field will not be indexed and will never be displayed.

If this field is checked, but the Use This Table field for the table is unchecked, the field will not be used. In other words, both the table and the field checkboxes must be checked in order for a field to be indexed and displayed.

If you are configuring the Database for the first time, go back to the Administrator Control Center now. From there you can add additional Databases. When all Databases are added, they must be indexed (next section).

Indexing a Database

MWeb creates an index for most Databases on the Database Servers. (Absolutely no changes are made to your Databases: the index is created in MWeb's Control Tables.)

To index a Database, click the Databases button from the Administrator Control Center, then click the Index button for the Database. Because indexing can be a long process, a status report is displayed in your browser, updated every 5 seconds. The report will tell you when indexing is complete.

The time required to index the Database is related to its size. Here are some timings for several Databases on the same server. This is a fairly powerful server with a dual-core processor (which MWebIndex can make use of) and 4 GB of main memory.
Records      Keywords      Time
499,242      8,750,260     53 minutes
866,396      44,178,953    3 hours 40 minutes
1,605,660    95,969,975    14 hours
So you can see that MWeb can handle large Databases, but the time to index them can be significant. However, search performance is not affected by size; searches in all these Databases proceed at the same speed. Please see the Troubleshooting Guide for ways to speed up indexing.
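If you want a rough feel for these timings, you can convert them to keywords indexed per minute. The sketch below does only that arithmetic on the three rows above; it is not a prediction for your server, and the function name is hypothetical.

```python
# (records, keywords, minutes) taken from the timings listed above
TIMINGS = [
    (499_242, 8_750_260, 53),
    (866_396, 44_178_953, 3 * 60 + 40),
    (1_605_660, 95_969_975, 14 * 60),
]


def keywords_per_minute(keywords: int, minutes: int) -> float:
    """Average indexing rate for one timing row."""
    return keywords / minutes


rates = [round(keywords_per_minute(k, m)) for _, k, m in TIMINGS]
```

The three rates are not identical, so treat any estimate scaled from these figures as approximate.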

After indexing has been done for each Database, you should be able to perform a search. The Keyword Search and Advanced Search buttons in the Main Menu should also be clickable; if they are not, click the Recheck Project Status button in the Database List. You may now either search, or click the Admin button to add more Databases.

When all Databases are indexed, your Project is complete and ready for use!

You may wish to save time by indexing more than one Database at a time. We do not recommend or support this, but it may work if the Databases are on different servers; the indexing report will probably be meaningless, however. It will not work with two Databases on the same server.

Optional use of MySQL

Starting with Release 1.2, you may use the MySQL 5.0 database system to store MWeb's internal index (instead of the default SQLite database system). This may provide faster indexing and retrieval for large Databases. If you want to use MySQL, follow these steps now:

  1. If the server does not have MySQL 5.0, download and install it now. (We cannot distribute it without charging you for a license.) Beta sites: We have not yet tested MWeb with MySQL 5.1. If you have -- or wish to try -- MySQL 5.1, we would be happy to hear of the results.
  2. Create a default user. To do this, use the mysql program from a command prompt and issue these three instructions:
         create user ODBC;
         grant all privileges on *.* to 'ODBC'@'%';
         flush privileges;
    
  3. Create a database. To do this, use the mysql program from a command prompt and issue the instruction
         create database MWEB;
    
    (or choose your own name).
  4. In MWeb Universal, return to the Administrator Control Center, click the Databases button, then the Configure button, then the Edit button.
  5. Enter "MySQL" into the Index DBMS field.
  6. Enter the name of the MySQL database you just created into the Index Name field.
  7. The smaller the Index Key Length, the faster MWeb will run. Make the default of 40 smaller if your data will allow it, larger if necessary. The key length is the length of the unique identifiers in your Database, plus 4.
  8. Click Save.
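
The rule in step 7 (length of the unique identifiers, plus 4) can be worked out with a short script before you edit the Index Key Length. This is a minimal Python sketch; the function name and the sample identifiers are hypothetical, and you would substitute identifiers taken from your own Database.

```python
def index_key_length(identifiers, padding=4):
    """Length of the longest identifier plus the 4 extra characters noted in step 7."""
    longest = max((len(str(identifier)) for identifier in identifiers), default=0)
    return longest + padding


# Hypothetical identifiers; replace with values from your own Database.
sample_ids = ["66", "1998.014.0032", "A-17"]
suggested = index_key_length(sample_ids)
print("Suggested Index Key Length:", suggested)
```

If the result is below the default of 40, a smaller key length may be worth trying; if it is above 40, the default is too small for your data.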

When you index your Database, the index will be created in MySQL. If you wish to revert to MWeb's default of using SQLite, restore the MySQL Database setting to its default (blank); you will have to reindex your MWeb Database after that.

We have found no significant difference in speed of indexing or retrieval between MySQL and SQLite. Our largest test so far has been 1.6 million records (80 million keywords, 6 GB filesize). These indexes took about 8 hours to build.

Maintaining the Index

As the data in the Database changes, the index will become out of date. To reindex, follow the same process as in the previous section, Indexing a Database.

You should also reindex all Databases if you add, change, or delete any of the Project's Subsets and/or Search Categories. However, you may change the Sequences and Names of these without having to reindex. You may also add, change, or delete Stopwords without having to reindex.

Testing the Database

If you have performed all the preceding steps, your Project and Databases should be installed and configured correctly. To test this, click the Keyword Search button and perform a search. If the Search Results and Full Record displays are correct, MWeb is ready for use.


If a Database Changes

If changes are made to the structure of the Databases that MWeb is searching, these changes may have to be recorded in MWeb's configuration. To do this, follow these steps:

  1. Ensure that no one is performing maintenance on the Database. It must be available through ODBC, which means it must not be opened exclusively by some other user. Unless MWeb can access the Database it will make wrong decisions.
  2. Click the Logon button in the MWeb Main Menu and enter your Administrator ID and password.
  3. Click the Databases button. You will see a list of the Databases currently in the Project.
  4. Click Edit to edit the data you see in the Database listing. Make sure this data is correct before you attempt to use the Configure button.
  5. Click Configure to edit the details of the Database. The Configure button will go through the steps described above for configuring Features, Tables, Subsets, and Fields.

Starting over

We have not provided a function to reanalyze a Database that MWeb already knows about, since this would destroy all the configuration settings you have already. It is faster just to make changes. If you truly wish to start over, delete the Database (see next section), then re-add it.

Deleting a Database

You may delete a Database if it is no longer required in the Project. Deleting makes no changes on the Database server, either to your Database or to the MWeb Control Tables. This means you can add the Database back to the Project and it will still be configured.

  1. Ensure that no one is performing maintenance on the Database. It must be available through ODBC, which means it must not be opened exclusively by some other user. Unless MWeb can access the Database it will make wrong decisions.
  2. Click the Logon button in the MWeb Main Menu and enter your Administrator ID and password.
  3. Click the Databases button. You will see a list of the Databases currently in the Project.
  4. Click Delete to remove the Database from your Project.

Although a deleted Database will no longer be searched, its MWeb Control Tables remain on the Database Server; if you ever wish to re-add the Database, you will not have to reconfigure it unless its structure has changed.

However, if you wish to delete the MWeb Control Tables, they are in a file named MWebXML*.dat, where the * stands for a number. This file is located on the Database Server in the Data Directory you specified during installation (the default being UDCdata).

Caution: If there is more than one Database on the server, each will have its own MWebXML*.dat file. Be sure you delete the correct one. If you accidentally delete this file, use the Windows Recycle Bin on the Database Server to restore it.

If you are deleting the last or only Database in the project, and plan to use the Project with a new Database, we recommend that you delete the list of Subsets in the Project before adding the new Database, if they will no longer be pertinent. This will allow the Wizard to add the correct Subsets when you add another Database. (MWeb cannot do this automatically as the Subsets may contain valuable information.)

Databases with Images, Part 1 -- Full URLs to Images

MARC Databases -- read this section
MWeb Enterprise Databases -- handled by wizard
Self-Hosted PastPerfect Databases -- skip this section
PastPerfect Databases on the PP Server -- skip this section
Relational Databases -- read this section

Any table or Subset in the Database may have images associated with it. Images may reside on the same server as the Database, or on any server connected to the Internet. These image formats are supported: JPEG, GIF, and PNG.

Read this section if you store references to images as complete URLs, such as

http://yourdomain/path/imagename.jpg

MWeb assumes that this URL is for a full-size image, and the image will be resized whenever a thumbnail is needed. If you store only a partial path to images, see the next section.

MARC Databases

MARC files with images may use only the full-URL approach. Images may reside on the same server as the Database, or on any server connected to the Internet. These image formats are supported: JPEG, GIF, and PNG.

MWeb implements the guidelines in Guidelines for the Use of Field 856 (March 2003) from the Library of Congress, namely that the MARC records follow this standard:

  • 856 1st indicator is "4"
  • The first 856 $u contains the full URI to the image and ends in ".jpg", ".gif", or ".png" (but not case-sensitive)
  • The first 856 $z contains the image description and the second contains the image credits or copyright*
  • Multiple images are in separate 856 fields

* You may put both into the first $z, but if they are in separate $z's, MWeb will display the second on a new line.
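
The 856 rules listed above can be checked before indexing with a few lines of code. The Python sketch below is an illustration only: the dictionary layout for a field (an "ind1" value and a list of subfield code/value pairs) is a hypothetical representation chosen for this example, not MWeb's internal format or that of any particular MARC library.

```python
IMAGE_EXTENSIONS = (".jpg", ".gif", ".png")


def image_from_856(field):
    """Return (url, description, credit) if the 856 field follows the rules above, else None."""
    if field["ind1"] != "4":
        return None
    urls = [value for code, value in field["subfields"] if code == "u"]
    notes = [value for code, value in field["subfields"] if code == "z"]
    # Only the first $u is considered; the extension test ignores case.
    if not urls or not urls[0].lower().endswith(IMAGE_EXTENSIONS):
        return None
    description = notes[0] if len(notes) > 0 else None
    credit = notes[1] if len(notes) > 1 else None
    return urls[0], description, credit


# Hypothetical field for illustration.
example_field = {
    "ind1": "4",
    "subfields": [
        ("u", "http://yourdomain/path/imagename.JPG"),
        ("z", "side view"),
        ("z", "Copyright Trigram Studio"),
    ],
}
print(image_from_856(example_field))
```

A record with several images would carry several such 856 fields, each checked separately.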

We will add the flexibility to depart from this standard in future versions if requested.

If you have only one size of images (that is, no thumbnails), the indexing program will detect images in the MARC files and will configure the system automatically. You may skip the rest of this section. However, if you have thumbnails, keep reading.

Relational Databases

For Relational Databases, you need to tell MWeb how to find the images. To do so, log on as the Administrator, then click Databases, then the Configure button for the Database, then the Configure Tables link. This shows the list of tables in the Database. Now click the Edit button next to the name of the table with images.

In the Edit display you will see three additional fields: Image Query, Thumbnail Pattern, and Image Pattern. These are the fields we will now work with.

Image Query

Let's begin with the Image Query field. First, construct a SQL query that tells MWeb how the image directories and filenames are found for a given record, like this:

     select url
     from pot
     where id='115'

Next, since MWeb expects that the field selected is named xurl, add that as an alias to the statement. In addition, you do not need the "where" clause since MWeb will add this automatically. Thus we have:

     select url as xurl
     from pot

If the image information is kept in a different table from the main table, you may have a statement like this:

     select image.url as xurl
     from pot, image
     where pot.id=image.potid

Here you need a "where" clause to link the two tables. Do NOT use table aliases such as "from pot p, image i".

Here's a more complex example using three tables, with the full URL again returned as xurl. For the xdescriptor, see the next section.

     select image.caption as xdescriptor, image.url as xurl
     from image, pot_image, pot
     where image.imageid=pot_image.imageid
          and pot_image.potid=pot.id
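
As suggested later in this guide, it can be easier to try an Image Query in another tool before pasting it into MWeb. The Python sketch below runs the three-table query against a throwaway in-memory SQLite database. The table contents and image URLs are hypothetical. How MWeb adds its own per-record condition is not documented here, so the extra "and pot.id=?" condition is only an assumption made to imitate it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table pot (id text, title text);
    create table image (imageid integer, caption text, url text);
    create table pot_image (potid text, imageid integer);
    insert into pot values ('115', 'Tall bottle');
    insert into image values (1, 'side view', 'http://yourdomain/path/pot115a.jpg');
    insert into image values (2, 'profile', 'http://yourdomain/path/pot115b.jpg');
    insert into pot_image values ('115', 1);
    insert into pot_image values ('115', 2);
""")

# The Image Query as it would be entered in MWeb (no record filter, no table aliases).
IMAGE_QUERY = """
    select image.caption as xdescriptor, image.url as xurl
    from image, pot_image, pot
    where image.imageid=pot_image.imageid
         and pot_image.potid=pot.id
"""

# Assumption: the per-record filter is appended as one more condition on pot.id.
cursor = conn.execute(IMAGE_QUERY + " and pot.id=? order by image.imageid", ("115",))
columns = [description[0] for description in cursor.description]
rows = cursor.fetchall()

print(columns)
for row in rows:
    print(row)
```

Checking that the column names come back as xdescriptor and xurl, and that each image of the record appears on its own row, catches most alias and join mistakes.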

Image Descriptors and Credits

The Image Descriptor is a word or brief phrase describing the image, such as "side view", "profile", etc. It is displayed under the thumbnail or image in some displays, and provides the image "alt" attribute in all displays (an accessibility standard).

The Image Credit is the credit or copyright statement for the image. It is displayed under the full image in the Image Window.

The image descriptor and/or credit, if you use them, should precede xurl in the query, and use the field aliases of xdescriptor and xcredit, such as:

     select caption as xdescriptor, copyright as xcredit, url as xurl
     from pot

For an example of how Descriptors and Credits display, see this screenshot, in which "Christmas Card 2003" is the Descriptor and "Copyright © Trigram Studio 2002" is the Credit.

Thumbnails (MARC Databases and Relational Databases)

Since in the full-URL approach the URLs are to full images, when MWeb needs to display a thumbnail it will resize the full image to be 120 pixels on the longer side; the size is not currently a site option.

In Release 2.0 and later, you may change the thumbnail display size by using the Admin / Project menu buttons.

However, using a full image when a thumbnail is required is slow, since it increases the download time. Therefore if you also have (or wish to make) thumbnails for the images, you may tell MWeb about them in the Configure Tables display. Click the Edit button for the appropriate table or MARC file and enter data into these fields:

Thumbnail Pattern: The pattern of the URL to retrieve the thumbnail.
Image Pattern: The pattern of the URL to retrieve the full image.

Use these two fields to describe how to transform the URI to find the thumbnail. MWeb will substitute in the URI the words in the Image Pattern with the words in the Thumbnail Pattern.

For example, if the image URI is

     http://yourdomain/myproject/images/full/file.jpg

and the thumbnail URI is

     http://yourdomain/myproject/images/thumbs/file.jpg

put the word "full" in the Image Pattern and the word "thumbs" in the Thumbnail Pattern.

For another example, if the image URI is

     http://yourdomain/myproject/images/rarebooks/file.jpg

and the thumbnail URI is

     http://yourdomain/myproject/images/rarebooks/thumbs/file.jpg

leave the Image Pattern empty and add the word "thumbs" in the Thumbnail Pattern. In this case MWeb assumes that the /thumbs/ subfolder is always immediately before the image filename in the URI.
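
The two examples above can be expressed as a small function. This Python sketch is an interpretation of the substitution rule, not MWeb's actual code: it assumes that a "word" in the pattern corresponds to a whole path segment of the URI, and the function name is hypothetical.

```python
def thumbnail_url(image_url, image_pattern, thumbnail_pattern):
    """Derive the thumbnail URI from the full-image URI using the two patterns."""
    if image_pattern:
        # Replace the path segment that equals the Image Pattern.
        segments = image_url.split("/")
        return "/".join(
            thumbnail_pattern if segment == image_pattern else segment
            for segment in segments
        )
    # Empty Image Pattern: insert the thumbnail folder just before the filename.
    folder, _, filename = image_url.rpartition("/")
    return f"{folder}/{thumbnail_pattern}/{filename}"


print(thumbnail_url("http://yourdomain/myproject/images/full/file.jpg", "full", "thumbs"))
print(thumbnail_url("http://yourdomain/myproject/images/rarebooks/file.jpg", "", "thumbs"))
```

Running your own image URIs through a check like this is a quick way to confirm that the patterns you plan to enter produce the thumbnail locations you expect.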

If you do not wish to display images

To prevent MWeb from displaying images, use the Configure Tables display to Edit the table and delete the contents of the Image Query field. Records with images will be retrieved if the searcher specifies "Only records with images", but the images will not be shown.

This will have to be redone every time the file is reindexed.

Databases with Images, Part 2

MARC Databases -- skip this section
MWeb Enterprise Databases -- handled by wizard
Self-Hosted PastPerfect Databases -- image access is configured by the wizard if your images are stored in their normal subdirectories under the images path on the Database Server (for example, like /UDCimages/001/imagefile.jpg for full images and /UDCimages/001/thumbs/imagefile.jpg for thumbnails); if so, skip this section.
PastPerfect Databases on the PP Server -- handled by wizard
Relational Databases -- read this section

Any table or Subset in the Database may have images associated with it. Images may reside on the same server as the Database, or on any server connected to the Internet. These image formats are supported: JPEG, GIF, and PNG.

Read this section if you store partial paths to the images in your Database. This approach requires that you provide two sizes of images: full-size images and thumbnails.

It is common to use the same path and filename for both the thumbnail and the image, with the two sizes in subdirectories. For example, you may have a directory structure like this:

     Images
       |
       |- Department_x_images
       |    |
       |    |- full
       |    |    |
       |    |    |- Image1.jpg
       |    |    |
       |    |    |- Image2.jpg
       |    |
       |    |- thumbnail
       |         |
       |         |- Image1.jpg
       |         |
       |         |- Image2.jpg
       |
       |- Department_y_images
            |
            |- full
            |    |
            |    |- Image3.jpg
            |    |
            |    |- Image4.jpg
            |
            |- thumbnail
                 |
                 |- Image3.jpg
                 |
                 |- Image4.jpg

You need to tell MWeb how to find the images. To do so, log on as the Administrator, then click Databases, then the Configure button for the Database, then the Configure Tables link. This shows the list of tables in the Database. Now click the Edit button next to the name of the table with images.

In the Edit display you will see three additional fields: Image Query, Thumbnail Pattern, and Image Pattern. These are the fields we will now work with.

Image Query: For MWeb Enterprise Databases and PastPerfect Databases on the PP Server, leave this field empty. For Self-Hosted PastPerfect Databases and Relational Databases, the SQL code to retrieve images for this table.
Thumbnail Pattern: The pattern of the URL to retrieve the thumbnail.
Image Pattern: The pattern of the URL to retrieve the full image.

Image Query

Let's begin with the Image Query field. First, construct a SQL query that tells MWeb how the image directories and filenames are found for a given record, like this:

     select thumbfolder, thumbname, 
          imagefolder, imagename
     from pot
     where id='115'

Next, since MWeb expects that the four fields selected are named tpath, tfile, ipath, and ifile, respectively, add those as aliases to the statement. In addition, you do not need the "where" clause since MWeb will add this automatically. Thus we have:

     select thumbfolder as tpath, thumbname as tfile, 
          imagefolder as ipath, imagename as ifile
     from pot

If the image information is kept in a different table from the main table, you may have a statement like this:

     select image.thumbfolder as tpath, image.thumbname as tfile, 
          image.imagefolder as ipath, image.imagename as ifile
     from pot, image
     where pot.id=image.potid

Here you need a "where" clause to link the two tables. Do NOT use table aliases such as "from pot p, image i".

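If the thumbnail and the full image share the same folder and filename, as in the directory structure shown earlier, it would be tedious (but permissible) to repeat them in the SQL query. Instead you may use xpath and xfile like this: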

     select thefolder as xpath, thename as xfile
     from pot

(You will specify how to distinguish the two image sizes in the section below on Thumbnail Pattern and Image Pattern.)

Here's a more complex example using three tables, and using the xfile technique. This Database does not use directories, so there is no xpath. For the xdescriptor, see next section.

     select image.caption as xdescriptor, image.filename as xfile
     from image, pot_image, pot
     where image.imageid=pot_image.imageid
          and pot_image.potid=pot.id

Image Descriptors and Credits

The Image Descriptor is a word or brief phrase describing the image, such as "side view", "profile", etc. It is displayed under the thumbnail or image in some displays, and provides the image "alt" attribute in all displays (an accessibility standard).

The Image Credit is the credit or copyright statement for the image. It is displayed under the full image in the Image Window.

The image descriptor and/or credit, if you use them, should precede tpath or xpath in the query, and use the field aliases of xdescriptor and xcredit, such as:

     select caption as xdescriptor, copyright as xcredit,
          thumbfolder as tpath, thumbname as tfile, 
          imagefolder as ipath, imagename as ifile
     from pot

For an example of how Descriptors and Credits display, see this screenshot, in which "Christmas Card 2003" is the Descriptor and "Copyright © Trigram Studio 2002" is the Credit.

Thumbnail Pattern and Image Pattern

MWeb needs to know how the full URLs to the images are constructed. This is the purpose of the fields Thumbnail Pattern and Image Pattern. These are simply URL patterns that reflect the structure on the server for this Database. Here's an example of a Thumbnail Pattern:

     http://yourdomain/myproject/images/tpath/thumbs/tfile.jpg

Here's an example of an Image Pattern:

     http://yourdomain/myproject/images/ipath/full/ifile.jpg

Notice that these patterns use the tpath, tfile, ipath, and ifile terms you used in the Image Query. In a pattern, each of these terms must have punctuation on both ends. In other words, don't use words like "datpath" or "tpath03" that contain "tpath" within them.

If you store image filenames including their extensions, such as ".jpg", this example would be:

     http://yourdomain/myproject/images/tpath/thumbs/tfile

Here tfile is still a complete word since it ends the string.

If you do not use directories to organize your images, just leave tpath and ipath out of the SQL for the Image Query and the Patterns. MWeb will automatically adjust the URL.

Both Thumbnail Pattern and Image Pattern are required, even if you use the xpath/xfile technique. In addition, use tpath, tfile, ipath, and ifile in the patterns, not xpath and xfile:

     http://yourdomain/myproject/images/tpath/thumbs/tfile.jpg
     http://yourdomain/myproject/images/ipath/full/ifile.jpg
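
The way the patterns and the Image Query results combine can be sketched in a few lines of Python. This is an illustration of the rules described in this section, not MWeb's implementation: the function names are hypothetical, and the whole-word test is approximated with a regular expression that refuses a placeholder glued to other letters or digits.

```python
import re

PLACEHOLDER = re.compile(r"(?<![A-Za-z0-9])(tpath|tfile|ipath|ifile)(?![A-Za-z0-9])")


def fill_pattern(pattern, values):
    """Replace tpath/tfile/ipath/ifile in a URL pattern where they stand as whole words."""
    def substitute(match):
        # Leave the placeholder untouched if the query supplied no value for it.
        return values.get(match.group(0), match.group(0))
    return PLACEHOLDER.sub(substitute, pattern)


def image_urls(row, thumbnail_pattern, image_pattern):
    """Build (thumbnail URL, full-image URL) from one Image Query result row."""
    values = {}
    for size in ("t", "i"):
        for part in ("path", "file"):
            # With the xpath/xfile technique, one value serves both image sizes.
            value = row.get(size + part, row.get("x" + part))
            if value is not None:
                values[size + part] = value
    return fill_pattern(thumbnail_pattern, values), fill_pattern(image_pattern, values)


row = {"xpath": "Department_x_images", "xfile": "Image1"}  # hypothetical query result
thumb, full = image_urls(
    row,
    "http://yourdomain/myproject/images/tpath/thumbs/tfile.jpg",
    "http://yourdomain/myproject/images/ipath/full/ifile.jpg",
)
print(thumb)
print(full)
```

Note that a word such as "datpath" is left alone by fill_pattern, which mirrors the warning above about words that merely contain "tpath".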

Advanced Relational Topics

MARC Databases -- skip this section
MWeb Enterprise Databases -- skip this section
Self-Hosted PastPerfect Databases -- skip this section
PastPerfect Databases on the PP Server -- skip this section
Relational Databases -- read this section

Normally to display a Full Record MWeb uses the data in a single Database table. However, Subsets in a Relational Database are often more complex than that, and may require gathering data from more than one table (such as basic data in OBJECT table plus artist data in a PERSON table). In order to display such data, you may describe to MWeb how to construct the full record for each Subset, by adding the SQL (Structured Query Language) select statement required to retrieve the data fields you want displayed. The SQL statement goes into the Full Record Query field in the Configure Tables display.

For example, on the MWeb Universal demo site, one of the Databases is for the Trigram pottery studio. A full record for a pot consists of data from the POT, CLAY, and GLAZE tables. Combining these requires a fourth table, POTGLAZE, which links each pot to its glazes. Here is a sample record:

POT table
     id  series     title        descriptn  date  height  diameter  length  cnum
     66  otherwork  Tall bottle  coiled     3-01  12.5                      C

CLAY table
     id  name
     C   Soldate 60

POTGLAZE table
     pnum  gnum
     66    5b
     66    7a

GLAZE table
     id  name
     5b  BLACK
     7a  TRANSPARENT

Here is the actual SQL query that links these four tables to display the full record shown on the MWeb Universal demo site:

     select pot.id, pot.series, pot.title, pot.descriptn, pot.date, 
          clay.name, pot.height, pot.diameter, pot.length, glaze.name 
     from pot left outer join clay on clay.id=pot.cnum 
          left outer join potglaze on potglaze.pnum=pot.id
          left outer join glaze on glaze.id=potglaze.gnum

The Full Record displays like this in MWeb. MWeb inserts the name of the table each field comes from so it is easier to understand:

     POT_ID           66
     POT_Series       otherwork
     POT_Title        Tall bottle
     POT_Description  coiled
     POT_Date         3-01
     CLAY_Name        Soldate 60
     POT_Height       12.5
     GLAZE_Name       BLACK
     GLAZE_Name       TRANSPARENT

Points to note about this query:

  • MWeb uses standard SQL queries. If you know SQL and know your database structure you should have no trouble.
  • You may specify the order in which the fields display in the full record by their position in the SQL query.
  • For this pot, the diameter and length are empty, so they do not appear in the MWeb Full Record.
  • The fieldnames displayed do not have to be those actually used in the tables; for example, the POT table has a "descriptn" field but this displays as "Description". (See Configuring Fields above for how to do this.)
  • Likewise the fieldnames do not have to match between tables; notice the example said "clay.id=pot.cnum" -- we call the clay code "id" in the CLAY table but "cnum" in the POT table.
  • The purpose of the POTGLAZE table is so there can be more than one glaze for a pot. A linking table like this is used so we do not have to put the codes into the POT table -- because no matter how we did this, it would be troublesome. Either we would have to decide up front the maximum number of glazes that could be used on a pot and create that many fields; or we would use some clumsy repeating code like "5b;7a" which would have to be parsed.
  • The purpose of the "left outer join" statements is to join the four tables together. The "left outer" takes care of situations where the pot has no clay code or glazes; otherwise the record would not be retrieved (this is a characteristic of the SQL language).

MWeb uses the SQL queries during indexing as well. This means that words in related tables are indexed as if they belong to the main table for a Subset. In the example above, a search on the word "soldate" will retrieve pot 66, even though the word appears only in the CLAY table, not in the POT table.
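
To experiment with a Full Record Query outside MWeb, you can load a few sample rows into a scratch database and run the query there. The Python sketch below uses an in-memory SQLite database with the sample record from this section. Pot 67 is a hypothetical extra row with no clay or glaze, added to show the effect of "left outer join". The join to GLAZE is written as glaze.id=potglaze.gnum to match the GLAZE table's id column. The "order by" clause and the word-splitting at the end are additions for this demonstration only and say nothing about how MWeb itself orders rows or tokenizes text.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table pot (id text, series text, title text, descriptn text, date text,
                      height real, diameter real, length real, cnum text);
    create table clay (id text, name text);
    create table potglaze (pnum text, gnum text);
    create table glaze (id text, name text);
    insert into pot values ('66', 'otherwork', 'Tall bottle', 'coiled', '3-01',
                            12.5, null, null, 'C');
    insert into pot values ('67', 'otherwork', 'Plain bowl', 'thrown', '4-01',
                            6.0, null, null, null);  -- hypothetical row
    insert into clay values ('C', 'Soldate 60');
    insert into potglaze values ('66', '5b');
    insert into potglaze values ('66', '7a');
    insert into glaze values ('5b', 'BLACK');
    insert into glaze values ('7a', 'TRANSPARENT');
""")

FULL_RECORD_QUERY = """
    select pot.id, pot.series, pot.title, pot.descriptn, pot.date,
         clay.name, pot.height, pot.diameter, pot.length, glaze.name
    from pot left outer join clay on clay.id=pot.cnum
         left outer join potglaze on potglaze.pnum=pot.id
         left outer join glaze on glaze.id=potglaze.gnum
"""

# "order by" is added here only to make the demonstration output predictable.
rows = conn.execute(FULL_RECORD_QUERY + " order by pot.id, glaze.name").fetchall()
for row in rows:
    print(row)

# Words reachable from pot 66 through the joins, including those from CLAY and GLAZE.
words_for_pot_66 = {
    word.lower()
    for row in rows if row[0] == "66"
    for value in row if isinstance(value, str)
    for word in value.split()
}
print(sorted(words_for_pot_66))
```

Pot 66 comes back once per glaze, pot 67 still appears even though it has no clay or glaze, and "soldate" is among the words associated with pot 66 although it is stored only in the CLAY table.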

We realize this is not exactly beginner's fare. And for complex records this can be a tedious process to get right. You may require a database person to help with this. You may find it easier to use another tool to experiment with the SQL, then copy it to MWeb when it is working.

Do not use table aliases in your queries, such as in this incorrect example: select p.id, p.name, pg.glazename from pot p, potglaze pg where pg.potid=p.id

You may use * in a query. This retrieves only the fields whose Full Record Sequence field (set during Database configuration) is not zero. This allows you to quickly specify that you want all the displayable fields in a table without writing long SQL statements. This feature also causes the fields to display in the order determined by their Full Record Sequences.
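
The effect of * described above can be pictured as a filter followed by a sort. This Python sketch is only a model of that behavior; the function name and the sample Full Record Sequence values are hypothetical.

```python
def displayable_fields(full_record_sequences):
    """Fields with a nonzero Full Record Sequence, in sequence order."""
    shown = [
        (sequence, name)
        for name, sequence in full_record_sequences.items()
        if sequence != 0
    ]
    return [name for _sequence, name in sorted(shown)]


# Hypothetical settings: cnum is hidden because its sequence is 0.
sequences = {"id": 1, "title": 3, "series": 2, "cnum": 0}
print(displayable_fields(sequences))
```

With these settings the fields would display as id, series, title, regardless of their order in the table.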

More about MARC Databases

What Is a MARC Database?

A MARC Database consists of one or more files with these characteristics:

  • All files have the same format. At present they may be MARC21 Bibliographic or MARC21 Authority. Later releases of MWeb Universal will add UNIMARC and MARCXML.
  • All files have the same encoding, either MARC-8 or UTF-8.
  • Files are strict MARC. Carriage returns, line-feeds, and other illegal characters will confuse the indexing program. Later releases of MWeb Universal will relax this restriction.
  • We are not aware of any file-size limitation. The largest file we have tested with is 2 GB. If you have a larger file, we would be happy to test it for you before you purchase MWeb.
  • There is no limit to the number of records.
  • Records and fields are limited to the MARC limits of 99,999 and 9,999 octets, respectively.
  • UTF-8 files may contain the entire Unicode character set.
  • Records may contain local-use tags; see below for how these are handled.
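
Because stray carriage returns and line-feeds can confuse the indexing program, it may be worth scanning a file before indexing it. The Python sketch below relies on two facts of the MARC record structure: the first five characters of each record give its length in octets, and each record ends with the record terminator (hexadecimal 1D). It is a rough structural check under those assumptions, not a full MARC validator, and the function name is hypothetical.

```python
RECORD_TERMINATOR = b"\x1d"
LEADER_LENGTH = 24


def check_marc_bytes(data):
    """Return a list of structural problems found in a block of MARC records."""
    problems = []
    pos = 0
    while pos < len(data):
        length_field = data[pos:pos + 5]
        if len(length_field) < 5 or not length_field.isdigit():
            problems.append(f"offset {pos}: record length is not five digits "
                            "(stray bytes between records?)")
            break
        length = int(length_field)
        if length <= LEADER_LENGTH:
            problems.append(f"offset {pos}: record length {length} is too short")
            break
        record = data[pos:pos + length]
        if len(record) < length:
            problems.append(f"offset {pos}: file ends before the record does")
            break
        if not record.endswith(RECORD_TERMINATOR):
            problems.append(f"offset {pos}: record terminator missing")
        if b"\r" in record or b"\n" in record:
            problems.append(f"offset {pos}: record contains CR or LF")
        pos += length
    return problems


# Hypothetical 30-octet stand-in for a record: length field, filler, terminator.
sample = b"00030" + b"x" * 24 + RECORD_TERMINATOR
print(check_marc_bytes(sample + sample))
print(check_marc_bytes(sample + b"\r\n" + sample))
```

To check a real file, read it in binary mode and pass the bytes to the function; for very large files you would read record by record instead of loading the whole file into memory.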

Standard tags

When you install MWeb Universal, information about all the standard tags for the format (bibliographic or authority) is included. Settings for these tags are as follows; you may change these settings using the method described above under Configuring Fields.

  • In Full Records, MWeb displays the fieldname followed by the tag, such as "Control Number 001". At present this is not configurable.
  • In Full Records, fields are displayed in the order they appear in the record. You may prevent a field from displaying in Full Records by changing its Full Record Sequence to 0.
  • In Search Results, MWeb displays the fieldname as a column heading.
  • Fields are displayed using the standard names as assigned by the Library of Congress. These can be awkward, such as "Fixed-Length Data Elements--Additional Material Characteristics" for the 006.
  • Most fields are set to be indexed. They are assigned Search Categories based on their fieldnames.

Local-use tags

MWeb Universal can index, search, and display data in local-use tags, including non-numeric tags. These tags are discovered whenever files are indexed; therefore you do not have to add these to MWeb manually. However, after local-use tags are discovered by the indexing program, you may wish to change the default settings MWeb gives them:

  • Fieldnames are assigned as "Tag xxx", where xxx is the local-use tag. Since fieldnames in Full Records include the tag also, these are displayed as "Tag xxx xxx" in Full Records. The assumption is that you will assign a more meaningful name to the local-use tag.
  • All local-use fields are set to be indexed. They are assigned the Search Category of 9, which when MWeb is installed means "Remarks". If you have removed or redefined Search Category 9, or wish the local-use fields to have other Search Categories, you may change them and then reindex.


All contents of website, including HTML and JavaScript, copyright © 1996-2008 Systems Planning. MWeb, MARCView, MARConvert, and InFORMer are trademarks of Systems Planning.

Systems Planning
4915 Redford Road
Bethesda, MD 20816 USA
(301) 652-1231
info@systemsplanning.com (Including the name of one of our products in your message will bypass all spam filters)