Apache Solr
Solr (pronounced "solar") is an open-source enterprise-search platform, written in Java, from the Apache Lucene project. Its major features include full-text search, hit highlighting, faceted search, real-time indexing, dynamic clustering, database integration, NoSQL features[2] and rich document (e.g., Word, PDF) handling. Providing distributed search and index replication, Solr is designed for scalability and fault tolerance.[3] Solr is widely used for enterprise search and analytics use cases and has an active development community and regular releases.
Developer(s) | Apache Software Foundation |
---|---|
Stable release | 8.7.0 / November 3, 2020[1] |
Repository | Solr Repository |
Written in | Java |
Operating system | Cross-platform |
Type | Search and index API |
License | Apache License 2.0 |
Website | lucene |
Solr runs as a standalone full-text search server. It uses the Lucene Java search library at its core for full-text indexing and search, and has REST-like HTTP/XML and JSON APIs that make it usable from most popular programming languages. Solr's external configuration allows it to be tailored to many types of applications without Java coding, and it has a plugin architecture to support more advanced customization.
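As an illustration of the HTTP interface, the following minimal sketch builds a URL for Solr's `/select` request handler and asks for JSON output. The host and port (`localhost:8983`), the collection name `techproducts`, the field names, and the helper `build_select_url` are assumptions chosen for the example; only the `q`, `rows`, `fl` and `wt` query parameters are standard Solr parameters. The sketch constructs the URL only and does not contact a server.

```python
from urllib.parse import urlencode


def build_select_url(base_url, collection, query, rows=10, fields=None):
    """Return a URL for Solr's /select handler that requests JSON output.

    base_url and collection are deployment-specific (assumed values below).
    """
    params = {"q": query, "rows": rows, "wt": "json"}
    if fields:
        # fl restricts which stored fields are returned for each document
        params["fl"] = ",".join(fields)
    return f"{base_url.rstrip('/')}/solr/{collection}/select?{urlencode(params)}"


# Hypothetical local instance and collection, used only for illustration
url = build_select_url(
    "http://localhost:8983", "techproducts", "name:ipod", rows=5, fields=["id", "name"]
)
print(url)
```

Because the request is an ordinary HTTP GET, the resulting URL can be fetched with any HTTP client in any language.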
Apache Lucene and Apache Solr are both produced by the same Apache Software Foundation development team.
History
In 2004, Solr was created by Yonik Seeley at CNET Networks as an in-house project to add search capability for the company website.
In January 2006, CNET Networks decided to openly publish the source code by donating it to the Apache Software Foundation.[4] Like any new Apache project, it entered an incubation period which helped solve organizational, legal, and financial issues.
In January 2007, Solr graduated from incubation status into a standalone top-level project (TLP) and grew steadily with accumulated features, thereby attracting users, contributors, and committers. Although quite new as a public project, it powered several high-traffic websites.[5]
In September 2008, Solr 1.3 was released including distributed search capabilities and performance enhancements among many others.[6]
In January 2009, Yonik Seeley along with Grant Ingersoll and Erik Hatcher joined Lucidworks (formerly Lucid Imagination), the first company providing commercial support and training for Apache Solr search technologies. Since then, support offerings around Solr have been abundant.[7]
November 2009 saw the release of Solr 1.4. This version introduced enhancements in indexing, searching and faceting, along with many other improvements such as rich document processing (PDF, Word, HTML), search results clustering based on Carrot2, and improved database integration. The release also featured many additional plug-ins.[8]
In March 2010, the Lucene and Solr projects merged.[9] Solr became a Lucene subproject. Separate downloads continued, but the products were now jointly developed by a single set of committers.
In 2011, the Solr version numbering scheme was changed to match that of Lucene: the release following Solr 1.4 was labeled 3.1, keeping Solr and Lucene on the same version number.[10]
In October 2012, Solr 4.0 was released, including the new SolrCloud feature.[11] A number of releases in the 4.x line followed in 2013 and 2014, steadily growing the feature set and improving reliability.
In February 2015, Solr 5.0 was released,[12] the first release in which Solr is packaged as a standalone application,[13] ending official support for deploying Solr as a WAR file. Solr 5.3 featured a built-in pluggable authentication and authorization framework.[14]
In April 2016, Solr 6.0 was released.[15] It added support for executing Parallel SQL queries across SolrCloud collections, and included StreamExpression support and a new JDBC driver for the SQL interface.
In September 2017, Solr 7.0 was released.[16] Among other things, this release added support for multiple replica types, auto-scaling, and a Math engine.
In March 2019, Solr 8.0 was released, including many bug fixes and component updates.[17] Solr nodes can now listen for and serve HTTP/2 requests; by default, internal requests are also sent using HTTP/2. In addition, an admin UI login was added with support for BasicAuth and Kerberos, and plotting math expressions in Apache Zeppelin became possible.
Operations
To search a document, Apache Solr performs the following operations in sequence:
- Indexing: Solr first converts the documents into a machine-readable format; this process is called indexing.
- Querying: Solr interprets the terms of the query submitted by the user. These terms can be images or keywords, for example.
- Mapping: Solr maps the user query to the documents stored in the index to find the appropriate results.
- Ranking the outcome: once the engine has searched the indexed documents, it ranks the results by relevance.
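The four steps above can be sketched with a toy inverted index. This is a deliberately simplified illustration, not Solr's or Lucene's actual implementation: the tokenizer only lowercases and splits on whitespace, and the score is a plain sum of term frequencies rather than a real relevance model. All function names and sample documents are hypothetical.

```python
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace; real analyzers do much more."""
    return text.lower().split()


def build_index(docs):
    """Indexing: map each term to {doc_id: term frequency}."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        for term, count in Counter(tokenize(text)).items():
            index[term][doc_id] = count
    return index


def search(index, query):
    """Querying, mapping and ranking in one pass.

    The query is tokenized, each term is looked up in the index, and
    matching documents are ordered by summed term frequency (ties are
    broken by document id).
    """
    scores = Counter()
    for term in tokenize(query):
        for doc_id, count in index.get(term, {}).items():
            scores[doc_id] += count
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical sample documents
docs = {
    "a": "Solr is a search server",
    "b": "Lucene is a search library and Solr uses Lucene",
    "c": "Cassandra is a database",
}
index = build_index(docs)
print(search(index, "lucene search"))  # → [('b', 3), ('a', 1)]
```

Document "b" ranks first because it contains "lucene" twice and "search" once, while "a" matches only "search"; document "c" matches neither term and is not returned.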
Community
Solr has both individuals and companies who contribute new features and bug fixes.[18][19][20][21][22]
Integrating Solr
Solr is bundled as the built-in search in many applications such as content management systems and enterprise content management systems. Hadoop distributions from Cloudera,[23] Hortonworks[24] and MapR all bundle Solr as the search engine for their products marketed for big data. DataStax DSE integrates Solr as a search engine with Cassandra.[25] Solr is supported as an endpoint in various data processing frameworks and enterprise integration frameworks.
Solr exposes industry-standard, REST-like HTTP APIs with both XML and JSON support, and will integrate with any system or programming language supporting these standards. For ease of use, client libraries are also available for Java, C#, PHP, Python, Ruby and most other popular programming languages.[26]
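As a rough sketch of integrating over plain HTTP without a client library, the example below prepares a JSON indexing request for Solr's `/update` handler and extracts the document list from a JSON search response. The base URL, the collection name `mycollection`, the document fields, the sample response text, and both helper functions are assumptions made for illustration. The request object is only constructed, not sent.

```python
import json
from urllib.request import Request


def build_update_request(base_url, collection, docs):
    """Prepare a POST that would index docs (a list of dicts) as JSON.

    commit=true asks Solr to make the documents searchable immediately.
    """
    url = f"{base_url.rstrip('/')}/solr/{collection}/update?commit=true"
    body = json.dumps(docs).encode("utf-8")
    return Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_docs(response_text):
    """Return the matching documents from a JSON search response body."""
    payload = json.loads(response_text)
    return payload["response"]["docs"]


# Hypothetical instance, collection and document
request = build_update_request(
    "http://localhost:8983", "mycollection", [{"id": "1", "name": "example"}]
)
print(request.get_method(), request.full_url)

# Hand-written sample in the shape of a Solr JSON search response
sample = (
    '{"responseHeader": {"status": 0},'
    ' "response": {"numFound": 1, "start": 0,'
    ' "docs": [{"id": "1", "name": "example"}]}}'
)
print(extract_docs(sample))
```

Passing the prepared request to `urllib.request.urlopen` would send it to a running server; the client libraries mentioned above wrap these same HTTP calls.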
References
- "News". Apache Foundation. Retrieved 14 August 2020.
- "Archived copy". Archived from the original on 2014-07-06. Retrieved 2014-07-10.
- "Apache Solr -". apache.org. Retrieved 16 January 2017.
- "[SOLR-1] CNET code contribution - ASF JIRA". apache.org. Retrieved 16 January 2017.
- "PublicServers - Solr Wiki". apache.org. Retrieved 16 January 2017.
- "Apache Solr -". apache.org. Retrieved 16 January 2017.
- "Support - Solr Wiki". apache.org. Retrieved 16 January 2017.
- "Apache Solr -". apache.org. Retrieved 16 January 2017.
- "[VOTE] merge lucene/solr development (take 3) - Yonik Seeley - org.apache.lucene.general - MarkMail". markmail.org. Retrieved 16 January 2017.
- "Solr3.1 - Solr Wiki". wiki.apache.org. 2013-05-16. Retrieved 2013-07-21.
- "Apache Lucene". lucene.apache.org. Retrieved 2013-07-21.
- "Apache Solr - News". apache.org. Retrieved 16 January 2017.
- "[SOLR-6733] Umbrella issue - Solr as a standalone application - ASF JIRA". apache.org. Retrieved 16 January 2017.
- "Solr 5.3 Release announcement". lucene.apache.org. Retrieved 2015-09-24.
- "Apache Solr - News". apache.org. Retrieved 16 January 2017.
- "Apache Solr - News".
- "Apache Solr 8.0 Release notes".
- "Highest Voted 'solr' Questions". stackoverflow.com. Retrieved 16 January 2017.
- "Lucene/Solr Revolution 2016". lucenerevolution.org. Retrieved 16 January 2017.
- "SFBay Apache Lucene/Solr Meetup". meetup.com. Retrieved 16 January 2017.
- "Oslo Solr Community". meetup.com. Retrieved 16 January 2017.
- "LinkedIn Solr Group". linkedin.com. Retrieved 16 January 2017.
- "Hadoop for Everyone: Inside Cloudera Search - Cloudera Engineering Blog". cloudera.com. 24 June 2013. Retrieved 16 January 2017.
- "Bringing Enterprise Search to Enterprise Hadoop - Hortonworks". hortonworks.com. 2 April 2014. Retrieved 16 January 2017.
- "DataStax Enterprise: Cassandra with Solr Integration Details". datastax.com. 12 April 2012. Retrieved 6 February 2017.
- "IntegratingSolr - Solr Wiki". apache.org. Retrieved 16 January 2017.
Bibliography
- Grainger, Trey; Potter, Timothy (March 2014). Solr in Action (1st ed.). Manning Publications. p. 664. ISBN 9781617291029.
- Smiley, David; Pugh, Eric; Parisa, Kranti; Mitchell, Matt (February 2014). Apache Solr 4 Enterprise Search Server (1st ed.). Packt Publishing. p. 451. ISBN 9781782161363.
- Serafini, Alfredo (December 2013). Apache Solr Beginner’s Guide (1st ed.). Packt Publishing. p. 324. ISBN 9781782162520.
- Rafalovitch, Alexandre (June 2013). Instant Apache Solr for Indexing Data How-to (1st ed.). Packt Publishing. p. 90. ISBN 9781782164845.
- Kuć, Rafał (January 2013). Apache Solr 4 Cookbook (1st ed.). Packt Publishing. p. 328. ISBN 9781782161325.
- Smiley, David; Pugh, Eric (November 20, 2011). Apache Solr 3 Enterprise Search Server (1st ed.). Packt Publishing. p. 418. ISBN 1-84951-606-5.
- Kuć, Rafał (July 22, 2011). Apache Solr 3.1 Cookbook (1st ed.). Packt Publishing. p. 300. ISBN 1-84951-218-3.
- Smiley, David; Pugh, Eric (August 19, 2009). Solr 1.4 Enterprise Search Server (1st ed.). Packt Publishing. p. 336. ISBN 1-84719-588-1.