With datasources growing steadily, some by 20,000 files a day, keeping every document full-text indexed and up to date has become an ever-increasing problem. Some users have been stuck with over 1 million un-indexed documents for a while now, with no help in sight. After some testing, we have a solution: by following the links below, users have successfully indexed 1 million un-indexed documents in a week's time. The big caveat is that this WILL NOT work unless the prerequisites are followed, and please note that not everyone will see the same results that other users have seen. The hardware of the machine is the limiting factor, so, as I have said, please do not attempt this unless you can meet the prerequisites.
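To put those numbers in perspective, here is a rough back-of-the-envelope sketch in Python. It is not part of the product. The function name `days_to_clear_backlog` and its parameters are made up for illustration, and it assumes the indexing rate and the daily intake stay constant, which real systems rarely manage. It only uses the figures quoted above: a backlog of 1 million documents, about 1 million indexed per week, and 20,000 new files per day.

```python
import math


def days_to_clear_backlog(backlog, indexed_per_day, new_per_day):
    """Estimate whole days needed to drain an indexing backlog.

    Hypothetical helper: assumes constant daily rates.
    Returns None when indexing cannot keep up with new files.
    """
    net_per_day = indexed_per_day - new_per_day
    if net_per_day <= 0:
        # The backlog never shrinks if intake meets or exceeds throughput.
        return None
    return math.ceil(backlog / net_per_day)


# 1,000,000 documents in one week is roughly 142,857 documents per day.
observed_rate = 1_000_000 // 7

# With 20,000 new files still arriving every day, the net rate is lower,
# so clearing the same backlog takes a little longer than a week.
print(days_to_clear_backlog(1_000_000, observed_rate, 20_000))  # → 9
```

The takeaway is that the daily intake eats into your throughput, so if your hardware can only index about as many documents as arrive each day, the backlog will never clear no matter how the configuration is tuned.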
First, we need to understand how the Automated File Processing numbers are generated:
The first optimization link covers adding another set of processors to increase processing throughput:
The second link explains how to properly adjust your processing configuration numbers: the number of documents sent to the queue, the retry time, and the time between checks for documents. This is an ongoing process, and you will need to monitor your system until you find the right numbers for your environment:
Also, if you need to move your index to another drive, which I would recommend because the index will then not be fighting the operating system for I/O, please follow this:
And last but not least, if you are running in a cluster, follow this blog post so that your users can search properly:
If you have any further questions about these links, please let us know in TSG by either submitting a service request or replying to this blog, and I will answer questions here.