Amazon CloudSearch | Search Features Flashcards
What are the best practices to accelerate domain configuration and re-indexing?
Search Features
Amazon CloudSearch | Analytics
When you change the configuration options of your search domain, you must rebuild your search index before those changes take effect in search results. Rebuilding the index can take 30 to 60 minutes, whether you make one configuration change or several at once. Re-indexing takes this long even if your domain has only a small number of documents, because of the processing and provisioning needed to build the index and distribute it. You should therefore plan your configuration changes ahead of time, make all of them at once, and then re-index your domain. The same applies when setting up a new domain: plan your configuration before you set it up so that you index only once and get up and running in the shortest possible time.
Some domain changes require re-indexing, while others only require re-deploying the existing index. Re-deploying the domain takes 10 to 15 minutes, compared to 30 to 60 minutes for re-indexing. During re-deployment, CloudSearch creates new nodes, deploys the index on them, and shuts down the old nodes. Your domain status changes to “Processing” during re-deployment. When re-indexing is needed, your domain status changes to “Needs Indexing,” and then to “Processing” once you have initiated indexing. Once the new index is created, your domain is re-deployed. The following table summarizes which changes require re-indexing followed by re-deployment and which require only re-deployment. Understanding this will help you plan your configuration changes.
Change                 Needs re-indexing   Needs re-deployment
Multi-AZ               No                  Yes
Index fields           Yes                 Yes
Index field options    Yes                 Yes
Instance type          Yes                 Yes
Partition count        Yes                 Yes
Replication count      No                  Yes
Suggesters             Yes                 Yes
Expressions            No                  Yes
Analysis schemes       Yes                 Yes
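The planning advice above can be sketched as a small lookup. This is only an illustration: the function name plan_changes and the result layout are hypothetical, the requirements are copied from the table, and the time estimates are the figures quoted above rather than values reported by CloudSearch.

```python
# Requirements copied from the table above:
# change type -> (needs re-indexing, needs re-deployment).
CHANGE_REQUIREMENTS = {
    "Multi-AZ": (False, True),
    "Index fields": (True, True),
    "Index field options": (True, True),
    "Instance type": (True, True),
    "Partition count": (True, True),
    "Replication count": (False, True),
    "Suggesters": (True, True),
    "Expressions": (False, True),
    "Analysis schemes": (True, True),
}


def plan_changes(changes):
    """Summarize what a batch of configuration changes will trigger.

    Hypothetical helper: it does not call CloudSearch, it only applies the table.
    """
    unknown = [c for c in changes if c not in CHANGE_REQUIREMENTS]
    if unknown:
        raise ValueError(f"Unknown change type(s): {unknown}")

    needs_reindex = any(CHANGE_REQUIREMENTS[c][0] for c in changes)
    needs_redeploy = any(CHANGE_REQUIREMENTS[c][1] for c in changes)

    # A batch costs as much as its most expensive change, which is why
    # grouping changes together is cheaper than applying them one by one.
    if needs_reindex:
        action, minutes = "re-index", (30, 60)
    elif needs_redeploy:
        action, minutes = "re-deploy", (10, 15)
    else:
        action, minutes = "none", (0, 0)
    return {"action": action, "estimated_minutes": minutes}
```

For example, plan_changes(["Expressions", "Replication count"]) reports a re-deploy only, while adding "Suggesters" to the same batch turns it into a single re-index.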
What search features does Amazon CloudSearch provide?
Amazon CloudSearch provides features to index and search both structured data and plain text, including faceted search, free text search, Boolean search expressions, customizable relevance ranking, query time rank expressions, field weighting, searching and sorting of results using any field, and text processing options including tokenization, stopwords, stemming and synonyms. It also provides near real-time indexing for document updates. New features include:
Autocomplete suggestions
Highlighting
Geospatial search
New data types: date, double, 64-bit signed int, LatLon
Dynamic fields
Index field statistics
Sloppy phrase search
Term boosting
Enhanced range searching for all field types
Search filters that don’t affect relevance
Support for multiple query parsers: simple, structured, lucene, dismax
Query parser configuration options
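To make the query parser and filter options listed above more concrete, here is a minimal sketch that builds a search request path. It assumes the 2013-01-01 search API with its q, q.parser, fq, and size parameters. The helper name build_search_path and the category field are hypothetical, and the code only builds a string; it does not contact a search domain.

```python
from urllib.parse import urlencode

# The four query parsers named in the feature list above.
QUERY_PARSERS = {"simple", "structured", "lucene", "dismax"}


def build_search_path(query, parser="simple", filter_query=None, size=10):
    """Build the path and query string for a search request.

    Hypothetical helper; the path and parameter names are assumed to follow
    the 2013-01-01 search API.
    """
    if parser not in QUERY_PARSERS:
        raise ValueError(f"Unsupported query parser: {parser}")

    params = {"q": query, "q.parser": parser, "size": size}
    if filter_query is not None:
        # fq narrows the result set without influencing relevance scores.
        params["fq"] = filter_query
    return "/2013-01-01/search?" + urlencode(params)


# "category" is a made-up field used only for illustration.
path = build_search_path("umbrella", filter_query="category:'outdoor'")
```

The resulting path would be appended to the search endpoint of your own domain.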
What is faceting?
Faceting allows you to categorize your search results into refinements on which the user can further search. For example, a user might search for “umbrellas”, and facets allow you to group the results by price, such as $0-$10, $10-$20, $20-$40, and so on. Amazon CloudSearch also allows for result counts to be included in facets, so that each refinement has a count of the number of documents in that group. The example could then be: $0-$10 (4 items), $10-$20 (123 items), $20-$40 (57 items), and so on.
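The idea of bucketed facets with counts can be illustrated locally, without calling CloudSearch. In the sketch below the function names and the sample prices are made up, and treating each bucket as including its lower bound and excluding its upper bound is an assumption; the text above does not say how boundary values such as $10 are assigned.

```python
def facet_counts(prices, buckets):
    """Count how many prices fall into each (low, high) bucket.

    Assumption: a bucket includes its low bound and excludes its high bound.
    """
    counts = {bucket: 0 for bucket in buckets}
    for price in prices:
        for low, high in buckets:
            if low <= price < high:
                counts[(low, high)] += 1
                break  # each price belongs to at most one bucket
    return counts


def format_facets(counts):
    """Render the counts in the same style as the example in the text."""
    return [f"${low}-${high} ({n} items)" for (low, high), n in counts.items()]


# Made-up prices standing in for the documents that matched "umbrellas".
sample_prices = [4.99, 8.50, 12.00, 19.99, 25.00, 39.50, 45.00]
labels = format_facets(facet_counts(sample_prices, [(0, 10), (10, 20), (20, 40)]))
```

Prices outside every bucket, such as 45.00 here, are simply not counted, which mirrors how a facet only reports the refinements it was asked for.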
What languages does Amazon CloudSearch support?
Amazon CloudSearch currently supports 34 languages: Arabic (ar), Armenian (hy), Basque (eu), Bulgarian (bg), Catalan (ca), Simplified Chinese (zh-Hans), Traditional Chinese (zh-Hant), Czech (cs), Danish (da), Dutch (nl), English (en), Finnish (fi), French (fr), Galician (gl), German (de), Greek (el), Hebrew (he), Hindi (hi), Hungarian (hu), Indonesian (id), Irish (ga), Italian (it), Japanese (ja), Korean (ko), Latvian (lv), Norwegian (no), Persian (fa), Portuguese (pt), Romanian (ro), Russian (ru), Spanish (es), Swedish (sv), Thai (th), and Turkish (tr). In addition, Amazon CloudSearch supports a Multiple (mul) option for fields that contain mixed languages.