Finding files quickly is critical to productivity in any office. Sifting through hard drive after hard drive and server after server for a single file takes employees away from the rest of their work. Several methods are available to speed up the search for data. Going beyond traditional localized search, a business may choose to rely on alternative methods, such as federated search, to help find information across the enterprise. Federated search can add value to enterprise search, so a thorough understanding of how each works helps give a clear picture of when federated search could and should be used.
Before looking at how federated search creates value, it is necessary to outline how federated search and enterprise search each function independently. Federated search is also referred to as universal search. When a user performs a search, the query returns results from multiple data sources in a unified result set.
Although the details depend on the search engine platform, the engine uses connector code to search across varying data sources (such as local hard drives, server blades, cloud storage, and remote databases) and returns a detailed list of the matching files, data, and their locations. While federated search does scan across all of these sources, it can struggle to rank results accurately (similar, if not identical, file names are often stored in different content repositories), so it may take some sifting through the results to find the exact information desired.
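The flow described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any particular platform implements it: the `Hit` type, the `federated_search` function, and the two in-memory sources are hypothetical stand-ins for real connectors.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Hit:
    source: str   # which repository returned the hit
    title: str    # file or record name
    score: float  # source-local relevance score (scales may differ per source)


def federated_search(query: str,
                     connectors: Dict[str, Callable[[str], List[Hit]]]) -> List[Hit]:
    """Send the query to every connector and merge the hits into one list."""
    merged: List[Hit] = []
    for search in connectors.values():
        merged.extend(search(query))
    # Naive merge: sort by each source's own score. The scores are not
    # comparable across sources, so this ordering is only a rough ranking.
    return sorted(merged, key=lambda hit: hit.score, reverse=True)


# Hypothetical in-memory sources standing in for a file share and a cloud store.
def file_share(query: str) -> List[Hit]:
    names = ["budget_2023.xlsx", "roadmap.docx"]
    return [Hit("file_share", name, 1.0) for name in names if query in name]


def cloud_store(query: str) -> List[Hit]:
    names = ["budget_2023.xlsx", "budget_notes.txt"]
    return [Hit("cloud_store", name, 0.5) for name in names if query in name]


results = federated_search(
    "budget", {"file_share": file_share, "cloud_store": cloud_store}
)
for hit in results:
    print(hit.source, hit.title, hit.score)
```

Note that the same file name, budget_2023.xlsx, comes back from both sources, which is exactly the kind of overlap that makes ranking difficult in a federated result set.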
Enterprise search provides a host of search features and functionality, although these features depend on the metadata included when the information is first logged and indexed. Metadata describes what a particular document contains and tries to explain what the record means, giving the searcher context about the contents of the data. When content is saved, the ability to provide additional metadata is usually available. Metadata may include who created the content, who viewed it, the department using it, and the content's purpose, and it can vary based on the needs of the company and the type of content. For instance, metadata for an event might include the location, the speaker, and the time the event is being held. Metadata works in a similar way in enterprise search.
An enterprise search may offer many different drop-down menus and filtering options for locating a specific piece of information. The initial search is similar to other search methods in that it relies on basic keywords. From there, fine-tuned search features come into play. The filters and drop-downs use the metadata to narrow the results down to those that match the metadata values chosen by the user.
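As a rough illustration of a keyword search followed by metadata filters, consider the sketch below. The document fields (`department`, `author`) and the `search` function are assumptions made for the example, not part of any specific product.

```python
from typing import Dict, List, Optional

# Hypothetical indexed documents, each carrying a little metadata.
documents: List[Dict[str, str]] = [
    {"title": "Q3 budget review", "department": "finance", "author": "lee"},
    {"title": "Budget template", "department": "hr", "author": "kim"},
    {"title": "Hiring plan", "department": "hr", "author": "kim"},
]


def search(docs: List[Dict[str, str]],
           keyword: str,
           filters: Optional[Dict[str, str]] = None) -> List[Dict[str, str]]:
    """Match on a keyword first, then keep only hits matching every filter."""
    filters = filters or {}
    keyword_hits = [d for d in docs if keyword.lower() in d["title"].lower()]
    return [
        d for d in keyword_hits
        if all(d.get(field) == value for field, value in filters.items())
    ]


broad = search(documents, "budget")                         # keyword only
narrow = search(documents, "budget", {"department": "hr"})  # keyword + filter
print(len(broad), len(narrow))
```

Each filter the user selects corresponds to one entry in `filters`, so adding a second drop-down choice simply narrows the list further.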
For a more in-depth outline of enterprise search, check out the previous post on enterprise search, its benefits, and how its implementation may improve productivity.
Federated search will require additional indexing of information. This means that when a record is saved, it needs to be indexed locally or in the search database. This storage could be in the cloud, of course, but the point is that the search does not rely on the data in its original location, so the engine does not have to go and get the data at the time the search is performed. To do this, the entire record must be loaded to the local server or storage area, which requires a large amount of bandwidth to handle the indexing. The data itself is saved just as it would be when saving a standard piece of information to a computer. However, different records are saved in different formats, which can make indexing challenging. Think of it as a library indexing books, DVDs, newspapers, magazines, audio files, and database records in the same section and trying to make sense of them in order to present them usefully to a user performing a search. If the indexing were not done properly, finding what you want would be difficult and time consuming. To index the data, the information and metadata of each record are converted into a single format. This way, no matter the original format of the data, it can be identified and presented during enterprise search, which unifies data from different sources into a single search.
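The idea of converting records from different formats into a single format before indexing might look like the following sketch. The `IndexRecord` shape, the two converter functions, and the simple inverted index are all illustrative assumptions; a real search engine would have its own schema and indexing pipeline.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class IndexRecord:
    """One common shape that every source format is converted into."""
    doc_id: str
    source: str
    text: str
    metadata: Dict[str, str] = field(default_factory=dict)


def from_file(path: str, content: str, owner: str) -> IndexRecord:
    # Hypothetical converter for a file on a shared drive.
    return IndexRecord(f"file:{path}", "file_share", content, {"author": owner})


def from_db_row(row: Dict[str, object]) -> IndexRecord:
    # Hypothetical converter for a database row with id/body/created_by columns.
    return IndexRecord(f"db:{row['id']}", "database", str(row["body"]),
                       {"author": str(row["created_by"])})


def build_index(records: List[IndexRecord]) -> Dict[str, Set[str]]:
    """Build a tiny inverted index: lowercase token -> ids of matching records."""
    index: Dict[str, Set[str]] = {}
    for record in records:
        for token in record.text.lower().split():
            index.setdefault(token, set()).add(record.doc_id)
    return index


records = [
    from_file("/shared/plan.txt", "Annual budget plan", "lee"),
    from_db_row({"id": 7, "body": "Budget approved", "created_by": "kim"}),
]
index = build_index(records)
print(sorted(index["budget"]))
```

Because both records share one shape, a single lookup on the word "budget" finds the file and the database row alike, regardless of where each one originally lived.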
As data is modified, possibly including data enrichment, and then re-saved, some additional indexing, scanning, and formatting of the data might be required. As more and more information is processed, indexed, and saved, the system uses more and more resources. This can strain system resources, including CPU and memory. While the purpose of pairing enterprise search with federated search is to join different data sources together, this concern for resources means that only information that will actually be searched should be indexed and saved. The enterprise search team should carefully evaluate what data is being saved and indexed by the search engine to limit the unchecked growth of data that could happen without a proper process in place.
Federated and enterprise search each provide a useful benefit for locating information on expansive servers and data networks, and each has a very specific use in identifying a desired file. However, each has its own shortcomings. While a federated search can scan varying data storage points with specified keywords, it cannot rank each result by how likely it is to be what the user wants, nor does it offer multiple search methods. Enterprise search, on the other hand, crawls the metadata of a file, which makes it possible to pinpoint the exact file using varying search criteria, yet it generally only works in a localized setting and not across expansive corporate data networks. Combining the two search methods makes it possible to get the strengths of both.
In addition, each data source could have its own firewall or other systems to safeguard information. The federated search needs to be able to reach the data sources through these firewalls and security systems. This means that a process for allowing the federated search to access secure data needs to be established in advance for every data source. Doing this as part of an enterprise search project saves significant work down the road, because users do not have to log into multiple systems to reach the data individually. Care should be taken to expose only data that is not sensitive and that the users performing the search are allowed to access. Ensuring that confidential data stays secure is still necessary.
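One common way to keep confidential data out of search results is to filter each result against the permissions of the user who is searching. The sketch below assumes that every indexed result carries an `allowed_groups` set and that the user's group memberships are known; both are hypothetical simplifications of whatever access model the underlying sources actually use.

```python
from typing import Dict, List, Set

# Hypothetical search results, each tagged with the groups allowed to see it.
results: List[Dict[str, object]] = [
    {"title": "Team handbook", "allowed_groups": {"staff", "hr"}},
    {"title": "Salary data", "allowed_groups": {"hr"}},
]


def trim_results(hits: List[Dict[str, object]],
                 user_groups: Set[str]) -> List[Dict[str, object]]:
    """Keep only hits that share at least one group with the user."""
    return [hit for hit in hits if hit["allowed_groups"] & user_groups]


visible_to_staff = trim_results(results, {"staff"})
visible_to_hr = trim_results(results, {"hr"})
print([hit["title"] for hit in visible_to_staff])
```

A user with no matching group sees nothing, so a result is hidden by default unless access has been granted explicitly.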
As noted previously, a federated search places additional demand on the resources that perform the indexing, store the search data, and deliver the enterprise search results to the end user. This strain is acceptable and worth the performance cost because of the benefit of searching multiple sources at the same time, without having to perform individual searches at each data source directly.
As a business grows, so too will its dependence on data. Data storage needs will expand right along with the company, making it difficult to locate specific information with standard search methods. Under a single firewall, network, or data source, enterprise search gives users the ability to perform varying searches based on metadata in order to find the desired information. However, once the company expands to multiple offices and data sources in different locations (even in the same building), using a single enterprise search to cover the company's entire data footprint simply becomes impossible. With the inclusion of a federated search, companies can bridge this gap and join data from multiple data sources. At that point, the enterprise search can be used to identify and present the results the user requested, based on the search criteria that were entered.