MWeb Catalogs for Large Systems

Introduction

Whether you prefer the term "integrated searching" or "federated searching", MWeb Enterprise and MWeb Universal provide two solutions for mounting a flexible web-based search engine and interface for data, images, and media. They can integrate any number of databases, in any format, into a single real or virtual database that can be searched as a whole or in parts. This makes MWeb ideal for consortia or enterprise-wide solutions.

Throughout this document, if we use the term "MWeb" without a qualifier it means both Universal and Enterprise, since they share many features. The primary difference between them is that Universal searches a virtual database (multiple databases on multiple servers), whereas Enterprise searches a single real database created by a batch process from multiple databases. This page discusses this and other differences in detail; in addition there is a tabular comparison.

The data used by MWeb must come from other systems. MWeb has no editing capabilities, and does not and cannot modify its source databases.

MWeb does not include any hardware or system software -- the client provides the Windows web-servers (which can be shared with other applications if necessary). We are developing a version of MWeb Universal to run on Linux servers.

Screenshots and examples relating to most of the topics discussed herein may be found on the specific pages for MWeb Enterprise and MWeb Universal.

Integrated Access to Any Number of Databases

Capitalized terms in our documentation indicate terms with precise meanings. The first time a term is used it is bolded. Please see the MWeb Glossary for more detail.

Both Universal and Enterprise allow each contributing department or consortium partner to continue to use its existing CMSs, data models, table structures, fieldnames, etc. MWeb works like a "black box," converting all source data to a unified model.

In the nature of things, this has some limitations. First, MWeb cannot create or access data that is not there; for example, if one contributor's database does not have a certain field, MWeb cannot create it or access it, so searches on that field will not include records from that contributor. Second, each contributor probably has different cataloging standards; there is some discussion below about how MWeb can help with this problem.

MWeb Enterprise uses its own MWeb Enterprise Database rather than accessing your internal content-management systems or collections-management systems (CMSs). The MWeb Enterprise Database is generated by running a batch process we provide called the MWeb Preprocessing System (PPS). Although there are several advantages to this, it is done primarily in order to integrate diverse data models by converting them to a common data model. For example, if you build a database from museum, library, and archive records, each type of data can come from a different CMS, and each type can preserve its own data fields and field names. The MWeb Enterprise Database preserves metadata with the data, similar to the way XML and MARC formats do.
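The idea of keeping each contributor's field names as metadata alongside the values can be illustrated with a small Python sketch. This is not the actual MWeb Enterprise data model, which is not published; the `FieldValue` class, the `to_common_model` function, and the sample records are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldValue:
    """One value in a hypothetical common model; the label travels with the data."""
    source: str  # which contributing CMS the value came from
    label: str   # the contributor's own field name, preserved as metadata
    value: str


def to_common_model(source, record):
    """Flatten one source record into FieldValue entries, skipping empty fields."""
    return [
        FieldValue(source, label, str(value))
        for label, value in record.items()
        if value not in (None, "")
    ]


# Two contributors with different field names for similar information.
museum_record = {"Artist": "Sullivan, Arthur", "Medium": "oil on canvas"}
library_record = {"Author": "Sullivan, Arthur", "Title": "Collected works", "ISBN": ""}

unified = (to_common_model("museum", museum_record)
           + to_common_model("library", library_record))

for entry in unified:
    print(entry.source, entry.label, entry.value, sep=" | ")
```

Because every value carries its own label, museum and library records can sit in one structure without being forced into a shared set of columns.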

The MWeb Enterprise Database can be built from almost any kind of files: relational databases, delimited text files, and MARC files have all been incorporated. We will add the capacity to read XML or EAD records when a client requires it. Normally these files are generated from each CMS or internal system and PPS builds the MWeb Enterprise Database from these exported files. On a fee basis, we may be able to modify PPS to read data directly from your CMSs, thereby saving you the export step.

In contrast to Enterprise, MWeb Universal does not build a master database but operates using Open Database Connectivity (ODBC). (It can also read MARC files directly.) However, this means that the original data is being indexed and displayed, unlike in Enterprise, where the Preprocessor adds value through extensive cross-linking, controlled vocabularies, and other enhancements.

MWeb Universal searches distributed databases in such a way that they appear to be a single database. It fulfills the same need as Z39.50 and similar protocols, but is much more flexible and requires far less implementation effort on your part. It is also much, much cheaper.
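As a rough illustration of searching several databases as if they were one, the following Python sketch sends the same search to two sources and merges the hits. It is only a sketch under stated assumptions: in-memory SQLite databases stand in for ODBC data sources, the table and column names are invented, and it ignores the indexing that Universal performs.

```python
import sqlite3


def make_source(create_sql, insert_sql, rows):
    """Build an in-memory SQLite database standing in for one contributor."""
    conn = sqlite3.connect(":memory:")
    conn.execute(create_sql)
    conn.executemany(insert_sql, rows)
    return conn


museum = make_source(
    "CREATE TABLE objects (maker TEXT, object_name TEXT)",
    "INSERT INTO objects VALUES (?, ?)",
    [("Smith, John", "Teapot"), ("Jones, Mary", "Quilt")],
)
library = make_source(
    "CREATE TABLE books (author TEXT, title TEXT)",
    "INSERT INTO books VALUES (?, ?)",
    [("Smith, John", "On Pottery")],
)

# Hypothetical mapping: each source keeps its own table and field names.
SOURCES = {
    "museum": (museum, "SELECT maker, object_name FROM objects WHERE maker LIKE ?"),
    "library": (library, "SELECT author, title FROM books WHERE author LIKE ?"),
}


def search_all(name_prefix):
    """Run the same creator search against every source and merge the hits."""
    hits = []
    for source_name, (conn, query) in SOURCES.items():
        for creator, title in conn.execute(query, (name_prefix + "%",)):
            hits.append({"source": source_name, "creator": creator, "title": title})
    return hits


results = search_all("Smith")
for hit in results:
    print(hit)
```

The caller sees one list of results with uniform keys, even though each source was queried with its own field names.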

Convenience, Performance, and Security

Additional database considerations

Besides providing tighter integration of data models, the MWeb Enterprise Database has other advantages:

  • Higher degree of security, since the live databases need not be accessible over the web. You can be sure that unauthorized users have absolutely no way to hack into your internal systems. (We plan to develop a means to copy live data so that Universal can share this advantage.)
  • The MWeb Enterprise Database is faster, since it is designed and tuned for optimum retrieval. At the same time, your internal systems are not burdened by queries coming from outside the organization.
  • PPS can integrate authority files and controlled vocabularies with the records they refer to, so that terms display as links to the authority records. In addition, we can program PPS to generate controlled vocabularies from fields in your Primary Records.
  • It is easier to load only the records you wish. For example, some records may not be ready for public scrutiny. Or a library may belong to several consortia for different subject areas or forms of material. It can be hard to control remote access to just the desired records, whereas it is easy to export just those records.
  • We can modify PPS to manipulate the data being loaded in ways that cannot be done using Universal's data model.

The disadvantages of the MWeb Enterprise Database are the necessity for each contributing department or partner to send data to a common point for batch processing, and the fact that a batch-generated database is never totally current.

Performance

All else being equal, Enterprise is faster than Universal, as it is accessing a single database on a single server. Enterprise also uses a single program for database access and display. Universal typically accesses each database on a separate server, and the database-access programs must communicate with the program handling the user interface. It is a tradeoff between performance and flexibility.

Data and image security

MWeb Enterprise offers three options for logon: no logon required; everyone must log on; or no logon for the public, but privileged users or staff may log on to access special functions, data, or images. Up to nine levels of privilege are supported. Logons are controlled by a unique User ID and an encrypted password. Privileges may also be granted based on IP address.

Security is extremely granular. Both the security level and the IP restriction can be applied to individual records, data fields, values, images, or image sizes, as well as to entire Content Types.
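To make the combination of privilege levels and IP restrictions concrete, here is a minimal Python sketch of such a check. It is not the MWeb Security Model: the `may_view` function, the use of level 0 for the public, and the rule that both the level and the IP restriction must be satisfied are assumptions made for illustration.

```python
import ipaddress


def may_view(user_level, user_ip, required_level=0, allowed_network=None):
    """Decide whether a user may see an item (record, field, image, etc.).

    Assumptions: level 0 means the public, higher numbers mean more
    privilege, and an item may also be limited to one IP network.
    """
    if user_level < required_level:
        return False
    if allowed_network is not None:
        return ipaddress.ip_address(user_ip) in ipaddress.ip_network(allowed_network)
    return True


# An unrestricted item is visible to the public.
print(may_view(0, "203.0.113.5"))
# A level-3 field limited to an internal network.
print(may_view(5, "10.1.2.3", required_level=3, allowed_network="10.0.0.0/8"))
print(may_view(5, "203.0.113.5", required_level=3, allowed_network="10.0.0.0/8"))
```

The same check could be applied at any granularity, since it only needs the restriction attached to the item being displayed.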

A full description of the MWeb Security Model may be found here for Enterprise.

Security is a future feature for Universal.

Customizable User interface

Both versions of MWeb can be customized. For Enterprise, the vendor performs the customization. The possibilities are limited only by the boundaries of sound user-interface design and cross-platform compatibility. We assist with both these aspects, but otherwise you and your designers are in control. Simply send us the required design and we do the rest. We will work with your designer if necessary as we implement the design.

However, changes to text messages or text buttons do not require the vendor; these can be modified by the system administrator using a web browser.

Your MWeb Enterprise Database can also be served by the MWeb Enterprise XML Server, which means you can access the Database using Flash, ASP, PHP, or any other technology. We are developing an open-source prototype called MWeb Flash to demonstrate this.

In contrast, MWeb Universal ships with a range of layouts, color schemes, messages, and other interface features that can be selected from a simple browser application.

In addition, because Universal is controlled by CSS and XSLT stylesheets, you can dramatically redesign the interface if you are familiar with those technologies. At that point Universal becomes a search engine that you can style as you wish.

For Accessibility issues, please see the Accessibility page.

Displaying Data

Search Results (brief records) and Full Records are displayed in your choice of several formats. The specifics depend on the version of MWeb and the user's query. In Universal, the formats are controlled by CSS and XSLT files; in Enterprise we code your choices into the program.

When MWeb displays a Full Record, it also displays the images (usually thumbnails) linked to the record, and an icon or link for each document and media item. Clicking on a thumbnail displays the full image. Clicking on a media icon plays the media or shows the media Full Record (site option). Clicking on a document link displays either the document text or the document Full Record (site option). Any sort of images or media can be linked, but of course the user must have a suitable player in order to see them. (Media and document links are not yet available in Universal.)

The descriptors you specify are displayed underneath each thumbnail or icon in the Full Record; for example, under a thumbnail you may wish to display "detail", "1938 revision", or some other phrase individuating the thumbnail. Under a media icon you may wish to display "narration by Robert Smith", or some similar phrase.

The client provides the images and thumbnails. We have planning documents that help with determining the optimum size of these. The client provides the document and media icons also, or we can have them made by a designer.

For documents and media, you have three choices for icons:

  1. Use a single icon for all documents and media (the descriptors would help the user understand what the icons represent).
  2. Use a different icon for each type of document and media, such as PDFs, videos, sound, etc.
  3. Use a unique icon for each document or media item. You must supply these, so this can be quite time-consuming.

Image Viewer

MWeb Enterprise includes our famous Image Viewer (IV). Whenever the user clicks on a thumbnail image anywhere in MWeb, the corresponding full-size image opens in the IV. Each image is added to those already in the IV, to permit detailed study and comparison. The user may enlarge, reduce, drag, or remove images. The user can choose whether to display brief or full data under each image.

We have recently added a Slide Sorter feature to the Image Viewer that makes each image small and "tiles" them so there are no overlaps. The user can then change their order by dragging them and the other images are repositioned. This new layout can be saved like any other layout.

MWeb Universal also allows any number of images to be viewed simultaneously. But instead of using the Image Viewer, when the user clicks on a thumbnail image, the corresponding full-size image opens in a new browser window. These windows can be moved and resized, but the images cannot be resized. As in Enterprise, the user can choose to display either brief or full data under each image.

For most users the two approaches are equally useful. However, for those whose focus is primarily the image, the Image Viewer provides a clean, uncluttered worksurface for study and comparison, with the additional ability to resize the images to view detail or to see more images at once than would be possible otherwise.

Hardware and Software Requirements

The client is responsible for providing a server running Windows Server, plus IIS or other web-server software. For Enterprise a single server is required; in contrast, the MWeb Universal Main Module runs on one server and its Database Connectors can run on that server or any number of other servers. (Of course, all servers must be connected to the Internet.) MWeb requires no special capabilities that are not provided with Windows, including database functionality (unless you wish to upgrade from the standard product). We attempt to keep the demand on the server low: we optimize the data model and software for this purpose. For example, Enterprise single-term searches require no joins.

Server requirements may be found here.

The capacity of the server should be based on the estimated usage. We are not experts in sizing servers, so you should consult your technical staff on this. Keep in mind that searches are database-intensive, so the more memory there is the better performance will be.

For MWeb Enterprise, disk capacity required depends on the nature of the data to be loaded, but about 5 to 7 KB per record is a reasonable estimate for all data and indexes. To this must be added space for images, documents, and media, which will probably be far more. We can help with these estimates.
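The arithmetic behind this estimate can be written out as a short Python sketch. The function name and the sample record count are hypothetical, and the result covers data and indexes only, not images, documents, or media.

```python
def estimate_data_bytes(record_count, kb_per_record_low=5, kb_per_record_high=7):
    """Return (low, high) byte estimates for data plus indexes.

    Uses the 5 to 7 KB per record rule of thumb; 1 KB is taken as 1024 bytes.
    """
    low = record_count * kb_per_record_low * 1024
    high = record_count * kb_per_record_high * 1024
    return low, high


# Hypothetical collection of 200,000 records.
low, high = estimate_data_bytes(200_000)
print(f"{low / 1024**3:.2f} to {high / 1024**3:.2f} GiB for data and indexes")
```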

Both versions of MWeb can access images, documents, and media that are stored on any number of other servers, to avoid having to make copies of these. For example, each partner to a consortium may already have images on a server, in which case there would be no need to load copies onto the MWeb server -- this can save much time and effort. Since the images and media are accessed by URI, the other servers need not be running Windows. (Of course, the images and media must be on servers that can be reached over the Internet.)

Both versions of MWeb are standard CGI applications. They know nothing about firewalls, proxy servers, or networks. Configure Windows, IIS, and your network according to your requirements.

As discussed above, MWeb Universal can search any database with an ODBC driver, as well as MARC library files and other MWeb databases. MWeb Enterprise uses a relational database management system (RDBMS) to store the MWeb Enterprise Database. This can be one of several RDBMSs on the market (details).

Rebuilding the MWeb Enterprise Database

The MWeb Enterprise Database, being read-only, needs to be rebuilt periodically with new and changed data. This is a straightforward process:

  1. Export the desired data from the CMSs and send it to a central point.
  2. Run the MWeb Preprocessor (a Windows application) which we provide you.
  3. Send the results to the web-server (using FTP or another method).
  4. Send any new images, documents, and media to the appropriate server.

Only step 1 is at all difficult. It requires your knowing the data model of your CMSs so that the desired fields and records can be output. Most MWeb clients use the SQL language for this, outputting the results to text files or Access databases. Once the SQL is written, it can be rerun for each update without modification. In addition, MWeb can import MARC records. We will develop the capacity to read XML or EAD records when required by a client.
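A minimal Python sketch of step 1 follows, assuming a CMS whose data can be reached with SQL. The table, the column names, and the `ready_for_public` flag are invented for illustration, SQLite stands in for the CMS database, and the tab-delimited layout is an assumption rather than a format required by PPS.

```python
import csv
import io
import sqlite3

# Hypothetical export query: written once, rerun unchanged for each update.
EXPORT_SQL = """
    SELECT object_id, title, artist
    FROM objects
    WHERE ready_for_public = 1
    ORDER BY object_id
"""


def export_records(conn, out):
    """Write the query results to `out` as tab-delimited text with a header row."""
    cursor = conn.execute(EXPORT_SQL)
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow([column[0] for column in cursor.description])
    count = 0
    for row in cursor:
        writer.writerow(row)
        count += 1
    return count


# Stand-in for a CMS database.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE objects "
    "(object_id INTEGER, title TEXT, artist TEXT, ready_for_public INTEGER)"
)
conn.executemany(
    "INSERT INTO objects VALUES (?, ?, ?, ?)",
    [
        (1, "Harbor at Dusk", "Smith, John", 1),
        (2, "Untitled study", "Jones, Mary", 0),
        (3, "Mule Team", "Smith, John", 1),
    ],
)

buffer = io.StringIO()  # a real export would open a text file instead
exported = export_records(conn, buffer)
print(buffer.getvalue())
```

The `WHERE` clause is where records that are not ready for public view would be left out of the export.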

For a consortium, each partner would send its exported files to a central point where the MWeb Preprocessor (PPS) would be run to build the MWeb Enterprise Database.

You may update the database regularly or on no fixed schedule, according to your needs. Input data from various sources need not be contributed on the same schedule; if new files are not received from one source, just use their previous files; their data will be older than the others', but it will be there. There is no need to burden yourself with complex scheduling in most cases.
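The "use the previous files" rule can be sketched in a few lines of Python. The data structure, file names, and dates below are hypothetical; the point is simply that each source contributes its most recent export, whenever that was received.

```python
from datetime import date


def latest_exports(received):
    """Pick the newest export file for each source.

    `received` maps a source name to a list of (date_received, filename)
    tuples; sources with no files at all are left out.
    """
    return {
        source: max(files)[1]  # tuples compare by date first
        for source, files in received.items()
        if files
    }


received = {
    "museum": [
        (date(2008, 1, 5), "museum_jan.txt"),
        (date(2008, 2, 4), "museum_feb.txt"),
    ],
    # The library sent nothing new in February, so January's file is reused.
    "library": [(date(2008, 1, 7), "library_jan.mrc")],
}

print(latest_exports(received))
```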

Related content, such as images, documents, and media, is not loaded into the MWeb database itself; only references to it are stored. For example, a record may contain references to images, documents, and media; each reference is either the URI or the path and filename of the object, depending on whether the materials are on other servers or on the same server.

PPS is a Windows application run on a desktop computer. PPS should not be run on the web-server; building a large database will have a serious impact on performance of the website.

Metadata

MWeb Enterprise's flexible data model can store any sort of information. Although we cannot describe this for business reasons, we can say that it stores metadata with the data, as do the MARC and XML formats.

With both Universal and Enterprise, metadata describing a digital asset, such as date-of-image-capture, can be searched like any other data. Or not, if you prefer: each MWeb client determines what fields are loaded and indexed. The client also determines what fields will be shown, in what order, and what they will be called.

With regard to data related to digital assets, such as the street address of a building in a photograph, MWeb treats this the same way -- it is simply data to be stored, indexed, searched, and displayed as you specify.

Controlled Vocabularies

If you wish, both Universal and Enterprise can search your existing authority files or controlled vocabularies (such as artist names, placenames, style names, periods, or subjects).

Enterprise provides more flexibility. Because the MWeb Enterprise Database is created by PPS, the authority terms can be displayed as links in the Primary Records, providing a way for users to explore the database by navigating through the links. In addition, we can program PPS to generate controlled vocabularies from fields in your Primary Records.

Administration and Configuration

MWeb requires little administration apart from normal software upgrades and server and database maintenance. Here are the tasks you may expect to do:

  • For Enterprise, you will need to extract new data from your CMSs periodically, as input to PPS. Consortium partners will send the data to the partner responsible for running PPS.
  • For Universal, you will rebuild the MWeb index periodically as databases change (a one-click operation).
  • You will need to add to the server any new images, documents, and media, unless these are to be stored on separate servers.
  • The system administrator should look at your server logs once a month to detect problems users may be experiencing.

Some special features or installations may require additional tasks:

  • If you have forms such as a user survey (Enterprise only), you will want to look at the survey results periodically. You will probably want to erase the results and start over periodically.
  • If you are capturing the searches that users perform (Enterprise only at present), you will want to run the MWeb Search Report periodically. This report is an HTML page that may be printed or otherwise manipulated. It shows a number of statistics about search activity on your site, with monthly and annual summaries.
  • If you have Staff Logon, you will need to add and remove names from the logon table for new or departing staff. (There is also a way for the public to add themselves if you do not wish to control this.)
  • You may wish to modify messages using IMS, or (for Universal) make other changes to the user interface.

Collaboration Features

By its ability to integrate any number of data sources in any format, MWeb provides the catalyst for collaboration. Each partner may use a different data model, different content, different fieldnames, etc. MWeb presents them all in a consistent manner.

The integrated database (whether Enterprise's real database or Universal's virtual one) may be searched as a whole, or searches may be limited in various ways.

Data Integration

The largest problem in any consortium database is intellectual integration -- integrating the terminology so the database presents itself consistently.

MWeb Enterprise can help provide consistent data by using authority files or controlled vocabularies (such as artist names, placenames, style names, periods, or subjects). This can be done in one of two ways: Either the Preprocessor (PPS) can load your existing authority files, or it can create authority files from data in other records (for example, extracting artist names from records for artworks).

Term Insurance™

We have recently begun to address the problem of integrating authority files from different data sources or organizations, which can be a problem if they do not use the same cataloging rules. In libraries this is less of a problem than in museums, as libraries generally use the same sources of name and subject headings. Merging library and non-library data will almost certainly present problems.

We are planning a new MWeb subsystem to be called Term Insurance™. It will provide a "crosswalk" between multiple sets of terminology. The results will be tables that will serve as input to the MWeb Preprocessor (PPS); PPS will use these to provide cross-references or to add additional terms to Primary Records.

We are now looking for a museum or consortium to work with us on this project to ensure that it is grounded in real needs.

Automatic matching

We plan for Term Insurance to find relationships between terms using a combination of lexical methods. Some matching methods will apply to terms as a whole, such as:

  • Exact match
  • Truncation, such as dropping of life dates for persons
  • Contained phrases (to match "religious life" and "<placename> -- religious life and customs")
  • Normalization of spacing ("muleskinner" and "mule skinner")
  • Normalization of capitalization ("del Monte" and "Del Monte")
  • Initials standing for names ("Smith, J" and "Smith, John")
  • Extended names ("Sullivan, Arthur" and "Sullivan, Sir Arthur" and "Sullivan, Arthur, Sir")
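A few of the whole-term methods listed above can be sketched in Python as follows. Term Insurance is only planned, so this is not its algorithm; the function names and the pattern for life dates are assumptions made for illustration, and only truncation of life dates, spacing, capitalization, and initials are covered.

```python
import re

# Hypothetical pattern for trailing life dates such as ", 1842-1900" or ", 1950-".
LIFE_DATES = re.compile(r",?\s*\d{4}\s*-\s*(\d{4})?\s*$")


def normalize_term(term):
    """Build a comparison key: drop life dates, ignore case and spacing."""
    term = LIFE_DATES.sub("", term)
    term = term.casefold()
    return re.sub(r"\s+", "", term)


def initials_match(a, b):
    """True if the surnames agree and one forename is the other's initial."""
    surname_a, _, forename_a = a.partition(",")
    surname_b, _, forename_b = b.partition(",")
    forename_a = forename_a.strip().rstrip(".").casefold()
    forename_b = forename_b.strip().rstrip(".").casefold()
    if surname_a.strip().casefold() != surname_b.strip().casefold():
        return False
    if not forename_a or not forename_b:
        return False
    short, long_ = sorted((forename_a, forename_b), key=len)
    return len(short) == 1 and long_.startswith(short)


def whole_term_match(a, b):
    """Combine the methods: normalized equality or an initials match."""
    return normalize_term(a) == normalize_term(b) or initials_match(a, b)


print(whole_term_match("muleskinner", "mule skinner"))
print(whole_term_match("del Monte", "Del Monte"))
print(whole_term_match("Sullivan, Arthur, 1842-1900", "Sullivan, Arthur"))
print(whole_term_match("Smith, J", "Smith, John"))
```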

Other methods apply to words within terms:

  • Singular/plural match
  • Vowel substitution (to match "woman" and "women")
  • Vowel collapsing ("color" and "colour")
  • Consonant substitution ("Cayuse" and "Cayuze")
  • Consonant doubling ("pelise" and "pelisse")
  • Omission of one character ("Frederic" and "Frederick")
  • Normalization of diacritics ("Renee" and "Renée")
  • Character transposition ("Stein" and "Stien")
  • Stemming ("religion" and "religious")
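Three of the word-level methods above lend themselves to a short Python sketch: removing accents, detecting a transposed pair of characters, and detecting a single omitted character. The function names are hypothetical and this is not the planned Term Insurance code. Note that the one-character test also happens to catch the "pelise"/"pelisse" and "color"/"colour" examples.

```python
import unicodedata


def strip_diacritics(word):
    """Remove accents by decomposing characters and dropping combining marks."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def is_transposition(a, b):
    """True if b is a with exactly one pair of adjacent characters swapped."""
    if len(a) != len(b) or a == b:
        return False
    diffs = [i for i in range(len(a)) if a[i] != b[i]]
    return (
        len(diffs) == 2
        and diffs[1] == diffs[0] + 1
        and a[diffs[0]] == b[diffs[1]]
        and a[diffs[1]] == b[diffs[0]]
    )


def is_one_char_omission(a, b):
    """True if deleting one character from the longer word gives the shorter."""
    short, long_ = sorted((a, b), key=len)
    if len(long_) - len(short) != 1:
        return False
    return any(long_[:i] + long_[i + 1:] == short for i in range(len(long_)))


print(strip_diacritics("Renée") == "Renee")
print(is_transposition("Stein", "Stien"))
print(is_one_char_omission("Frederic", "Frederick"))
print(is_one_char_omission("pelise", "pelisse"))
```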

For complex terms, a combination of several methods from both groups will be required, such as to match "religious life" and "<placename> -- religion".

We may also add a method of indicating the closeness of the relationship, which could be later used by MWeb to rank search results.
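One simple way to express closeness as a number is a string-similarity ratio. The Python sketch below uses the standard library's `difflib.SequenceMatcher` as a stand-in technique; the closeness measure for Term Insurance has not been defined, and the function names here are hypothetical.

```python
from difflib import SequenceMatcher


def closeness(a, b):
    """Similarity between two terms, from 0.0 (unrelated) to 1.0 (identical)."""
    return SequenceMatcher(None, a.casefold(), b.casefold()).ratio()


def rank_by_closeness(query, candidates):
    """Order candidate terms from closest to least close to the query."""
    return sorted(candidates, key=lambda term: closeness(query, term), reverse=True)


for term in rank_by_closeness("colour", ["religion", "color", "colour"]):
    print(term, round(closeness("colour", term), 2))
```

A lexical ratio like this says nothing about meaning, which is one reason editorial review of the matches would still be needed.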

Editing

After Term Insurance makes all the matches it can algorithmically, editors may wish to input additional terms, make or change relationships, make or change closeness codes, or make other changes.

For this purpose, Term Insurance will include an editorial interface providing various views of the data and methods of changing it.

The development of Term Insurance depends on approval of certain grants. You may choose to fund this development if it is critical to your needs.


Other Topics

Installation and Support

We provide generous support in implementing and maintaining your MWeb site.

  • We provide extensive documentation for you and your designer, technical staff, and other people involved. These documents are on the Web where they can be kept up-to-date. You may print them or store them locally if you wish. They include detailed instructions for installing MWeb on your server, running the Preprocessor on-site, reviewing statistics, and other tasks.
  • MWeb Enterprise comes with a context-sensitive Help file for the end-users, linked to buttons in MWeb. You may replace or supplement this Help with your own, or we can provide a custom Help file on a fee basis.
  • We provide unlimited support to you and your technical staff. (We do not support your end-users.)
  • Specialized data editing, cleanup, and conversion services are available, both manual and automated.

Multilingual Data and Interface

MWeb can store and display any Unicode characters including Chinese, Japanese, Korean, Hebrew, Arabic, Tibetan, etc. A field may contain characters from any number of languages. Only Latin characters are indexed at present; please let us know if you require other character sets indexed.

MWeb Enterprise has built-in support for a multilingual interface, which can be in any Unicode characters. (However, we have not tested right-to-left scripts.) We provide lists of messages and buttons for you to have translated. (System error messages are in English only.) There is an increase to the annual support cost for each language other than English.

For reading non-Latin data and interface elements, note that the potential users must have the required fonts on their computers -- but people that read those languages usually do.

Pricing

Current prices are shown on our Pricing page. We would be happy to assist with estimating costs when we have seen sample data and know more about your requirements.

For consortia, our policy is to contract with and bill to a single institution. This would be the consortium itself if it is a legal entity; if not, we will contract with and bill a designated partner.


All contents of website, including HTML and JavaScript, copyright © 1996-2008 Systems Planning. MWeb, MARCView, MARConvert, and InFORMer are trademarks of Systems Planning.

Systems Planning
4915 Redford Road
Bethesda, MD 20816 USA
(301) 652-1231
info@systemsplanning.com (Including the name of one of our products in your message will bypass all spam filters)