Metadata, or descriptive information about a
document (such as title, author, category, or
update date), can enrich the search
experience, but only if it is used
correctly. Some content providers have
categorizers on staff who associate metadata
with documents, but this can be very costly.
Some search engines can generate some metadata
for documents. Other specialized software
packages (entity extraction tools) can add
further metadata.
Structured vs. Unstructured -
Approximately 80% of searchable documents
have no metadata associated with them and
are "unstructured". The remaining 20%
have some metadata (or structure), and
advanced techniques can be used on these
structured documents to make it easier to
find information in them.
Faceted (Parametric) Search -
Faceted, or parametric, search is a technique
supported by many search engines in which
metadata and taxonomies are used together to
combine features of browsing and navigation
with search. Many search engines let users
search, browse, and drill down into content in
an integrated manner, which facilitates the
discovery of information.
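As a rough illustration of the idea, the sketch below counts facet values over document metadata and then drills down by selecting values. The document records, field names, and the helper names facet_counts and drill_down are assumptions made for this example; they are not taken from any particular search engine.

```python
from collections import Counter

# Hypothetical documents with metadata fields (assumed for illustration).
docs = [
    {"title": "Q1 results", "category": "finance", "author": "Lee"},
    {"title": "Q2 results", "category": "finance", "author": "Kim"},
    {"title": "Trial summary", "category": "pharma", "author": "Lee"},
]

def facet_counts(docs, field):
    """Count how many documents carry each value of a metadata field."""
    return Counter(doc[field] for doc in docs if field in doc)

def drill_down(docs, **filters):
    """Keep only documents whose metadata matches every selected facet value."""
    return [d for d in docs if all(d.get(k) == v for k, v in filters.items())]

print(facet_counts(docs, "category"))
print(drill_down(docs, category="finance", author="Lee"))
```

A real engine would compute these counts over the current result set rather than a Python list, but the browse-then-narrow pattern is the same.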
Indexing/Collection Build -
This occurs when content is put directly
into a "collection" or repository for direct
searching. Content may come from Content
Management Systems, file systems, databases,
subscription feeds, RSS, the internet, and
a variety of other sources. This
type of search is the most optimized for
performance and scalability.
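One common way to make a collection directly searchable is an inverted index, which maps each term to the documents that contain it. The following is a minimal sketch under that assumption; the document ids, sample text, and function names are hypothetical, and production engines add tokenization, ranking, and persistence on top of this.

```python
from collections import defaultdict

def build_index(documents):
    """Map each lowercase term to the set of document ids containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in text.lower().split():
            index[term].add(doc_id)
    return index

def search(index, query):
    """Return ids of documents containing every query term (AND semantics)."""
    terms = query.lower().split()
    if not terms:
        return set()
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result

# Hypothetical content gathered from different sources into one collection.
documents = {
    "cms-1": "quarterly sales report",
    "fs-7": "sales forecast spreadsheet",
    "rss-3": "industry news feed",
}
index = build_index(documents)
print(sorted(search(index, "sales")))
```

Because lookups touch only the index and never the original sources, query time does not depend on how slow or remote those sources are.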
Federated (meta) Search - A
federated search capability (searching
multiple sites at one time using each site's
native search capabilities) is most often
built when it is infeasible to index content
from the sources because of volume,
accessibility, security, update frequency,
license (or use) restrictions, or other
reasons.
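The pattern described above can be sketched as a function that forwards one query to several native search functions and merges whatever comes back. The three source functions here are stand-ins invented for the example, and sorting by raw score assumes the scores are comparable across sources, which is often not true in practice.

```python
def federated_search(query, sources):
    """Send the query to each source's own search function and merge the hits."""
    merged = []
    for name, search_fn in sources.items():
        try:
            hits = search_fn(query)
        except Exception:
            continue  # an unavailable source should not fail the whole search
        for title, score in hits:
            merged.append({"source": name, "title": title, "score": score})
    # Assumption: scores from different sources are on a comparable scale.
    merged.sort(key=lambda hit: hit["score"], reverse=True)
    return merged

# Hypothetical stand-ins for each site's native search capability.
def intranet_search(query):
    return [(f"Intranet page about {query}", 0.9)]

def library_search(query):
    return [(f"Licensed article on {query}", 0.7)]

def broken_search(query):
    raise TimeoutError("source unavailable")

sources = {"intranet": intranet_search, "library": library_search, "archive": broken_search}
results = federated_search("patents", sources)
for hit in results:
    print(hit["source"], hit["title"], hit["score"])
```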
Classification Tools - Some
search engines support document
classification. User-created rules
(taxonomies) as well as dynamic
(on-the-fly) classification
can be used to categorize documents.
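For the user-created rules case, a minimal sketch might assign a category whenever a document contains one of that category's keywords. The rule set and the classify function below are assumptions for illustration only; real taxonomies are usually hierarchical and use richer matching than exact words.

```python
# Hypothetical user-created rules: category -> trigger keywords.
RULES = {
    "finance": {"invoice", "revenue", "audit"},
    "pharma": {"clinical", "dosage", "trial"},
}

def classify(text, rules=RULES):
    """Return every category whose keyword set overlaps the document's words."""
    words = set(text.lower().split())
    return sorted(cat for cat, keywords in rules.items() if words & keywords)

print(classify("Clinical trial revenue forecast"))
print(classify("Holiday schedule"))
```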
Entity Extraction - Created by
search vendors or third-party vendors, these
tools add additional metadata (or attributes)
to documents. Some vendors have created
specialized modules for
different vertical sectors (e.g. financial,
pharmaceutical, banking, insurance) that
can add even more value when integrated with
search.
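A very simplified way to picture entity extraction is pattern matching that pulls typed values out of free text and stores them as metadata. The sketch below uses regular expressions for three entity types chosen for this example; commercial tools typically rely on dictionaries and statistical models rather than a handful of patterns.

```python
import re

# Hypothetical patterns for a few entity types (assumed for illustration).
PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "date": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),
    "amount": re.compile(r"\$\d+(?:,\d{3})*(?:\.\d{2})?"),
}

def extract_entities(text):
    """Return a metadata dict mapping each entity type to the matches found."""
    return {name: pattern.findall(text) for name, pattern in PATTERNS.items()}

text = "Contact jane@example.com by 2024-03-15 about the $1,200.50 invoice."
metadata = extract_entities(text)
print(metadata)
```

The resulting dictionary could then be indexed alongside the document so that the extracted values become searchable attributes or facets.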
Enhanced Functionality - The
most powerful search applications can add
value for users with extra features
including:
- Alerting
- Recommendation tools
- Saved searches
- Sharing of queries and notations
- Graphical navigation and tools
- Trending
- Use of synonyms/controlled vocabulary in search
- Search wizards and other advanced search techniques
- Context-dependent hyperlinking
- Integrated desktop search
- Role-based applications
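As one example from the list above, the use of synonyms or a controlled vocabulary can be sketched as query expansion: each query term is widened to the set of terms treated as equivalent. The synonym table and the expand_query function are hypothetical and exist only to illustrate the idea.

```python
# Hypothetical controlled vocabulary: term -> equivalent terms.
SYNONYMS = {
    "car": {"automobile", "vehicle"},
    "doctor": {"physician"},
}

def expand_query(query, synonyms=SYNONYMS):
    """Expand each query term into a sorted list of itself plus its synonyms."""
    expanded = []
    for term in query.lower().split():
        expanded.append(sorted({term} | synonyms.get(term, set())))
    return expanded

print(expand_query("car insurance"))
```

A search engine would typically treat each inner list as an OR group and combine the groups with AND.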
RTI uses SIFT to develop advanced,
feature-rich search applications
quickly and on budget, and to gather
data from a variety of sources, including
your Collections (multiple search engines),
file systems, Web pages, Exchange public
folders, Documentum, eRooms, Symantec
Enterprise Vault, databases (Oracle, SQL
Server, and Sybase), and other sources.
Additional sources can be added
if required.