Proposed behavioral change for Baloo indexing

Luca Beltrame
The Baloo indexer shipped with Plasma does its job in two passes: first the file
names are indexed (fast), then metadata and content are extracted (slow).
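
To make the two passes concrete, here is a minimal sketch in Python. It is not
Baloo's actual code: the function names, the dictionary layout and the
`only_basic_indexing` parameter are all made up for illustration, and "content
extraction" is reduced to reading the file as text.

```python
import os


def index_names(root):
    """Pass 1 (fast): record file names only, without opening any file."""
    index = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            index[path] = {"name": name, "content": None}
    return index


def extract_content(index, only_basic_indexing=False):
    """Pass 2 (slow): open every file and pull out searchable text.

    Skipped entirely when only_basic_indexing is set (hypothetical flag).
    """
    if only_basic_indexing:
        return index
    for path, entry in index.items():
        try:
            with open(path, "r", encoding="utf-8", errors="ignore") as handle:
                entry["content"] = handle.read()
        except OSError:
            # An unreadable file simply keeps its name-only entry.
            pass
    return index
```

The cost difference is visible in the structure: the first pass only walks
directories, while the second has to open and read every single file.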

This second pass can be problematic and it's highly user-dependent, because
the various metadata extractors have varying degrees of efficiency. When there
is a problem with metadata extraction, it's often the cause of the high CPU
usage and disk I/O reported.

One proposal we had on IRC was to change the behavior of the indexer (which is
enabled by default): switch it to "basic indexing" which only indexes the
names and not the metadata, as a solution for the above problem. There's a
(config-only) option that does the job.
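
For those who want to script such a toggle, a rough sketch in Python follows.
The group name "General" and the key "only basic indexing" are assumptions
(this post names neither the file nor the key), as is the idea that the file
is plain INI-style text that `configparser` can read. Check your own
configuration file before relying on any of this.

```python
import configparser

# Assumed names, not confirmed anywhere in this thread.
ASSUMED_GROUP = "General"
ASSUMED_KEY = "only basic indexing"


def _load(config_path):
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key case exactly as written in the file
    parser.read(config_path, encoding="utf-8")  # a missing file is ignored
    return parser


def set_basic_indexing(config_path, enabled):
    """Write the assumed option to the given INI-style file."""
    parser = _load(config_path)
    if not parser.has_section(ASSUMED_GROUP):
        parser.add_section(ASSUMED_GROUP)
    parser.set(ASSUMED_GROUP, ASSUMED_KEY, "true" if enabled else "false")
    with open(config_path, "w", encoding="utf-8") as handle:
        parser.write(handle, space_around_delimiters=False)


def get_basic_indexing(config_path):
    """Read the assumed option back; False if the file or key is absent."""
    parser = _load(config_path)
    return parser.getboolean(ASSUMED_GROUP, ASSUMED_KEY, fallback=False)
```

Note that `configparser` rejects entries that appear before the first group
header, so this sketch only suits files where every key sits inside a group.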

Due to a bug it was not possible to change that configuration and it was never
saved to disk. We also plan to add a checkbox to toggle the option on and off
in the configuration dialog (System Settings > Search) for those who use
metadata search.

However it's a behavioral change: anyone who uses the default configuration
will be affected by this. Hence this post to gather feedback. Should we do
this, or not?

NOTE: This is about the *default*. If you have disabled Baloo nothing will
happen.

--
Luca Beltrame - KDE Forums team
GPG key ID: A29D259B

Re: Proposed behavioral change for Baloo indexing

Pablo Sanchez-2
[ Comments below, in-line ]

On Sat, 08 Apr 2017 15:15:19 +0200, Luca Beltrame wrote:
>
> [ trimmed ]
>
> However it's a behavioral change: anyone who uses the default
> configuration will be affected by this. Hence this post to gather
> feedback. Should we do this, or not?

Hi Luca,

I don't use Baloo so my point below may be off the mark.

I'd go with the recommendation of disabling 'metadata and content
extraction'.

Ideally, when performing a Baloo search, it would be great to provide
a message to the user that 'metadata and content extraction' is not
enabled.  They can check a box to 'never see this message box' again.

Cheers,
--
Pablo Sanchez - Blueoak Database Engineering, Inc
Ph:    819.459.1926        iNum:  883.5100.0990.1054

--
To unsubscribe, e-mail: [hidden email]
To contact the owner, e-mail: [hidden email]

Re: Proposed behavioral change for Baloo indexing

Bruno Friedmann-2
On Saturday, 8 April 2017 15:36:09 CEST, Pablo Sanchez wrote:

> [ Comments below, in-line ]
>
> On Sat, 08 Apr 2017 15:15:19 +0200, Luca Beltrame wrote:
> > [ trimmed ]
> >
> > However it's a behavioral change: anyone who uses the default
> > configuration will be affected by this. Hence this post to gather
> > feedback. Should we do this, or not?
>
> Hi Luca,
>
> I don't use Baloo so my point below may be off the mark.
>
> I'd go with the recommendation of disabling 'metadata and content
> extraction'
>
> Ideally, when performing a Baloo search, it would be great to provide
> a message to the user that 'metadata and content extraction' is not
> enabled.  They can check a box to 'never see this message box' again.
>
> Cheers,
> --
> Pablo Sanchez - Blueoak Database Engineering, Inc
> Ph:    819.459.1926        iNum:  883.5100.0990.1054

Making Baloo light on first use, so that newcomers are impressed, is in my
view the right decision to take.
If the checkbox in System Settings is there and works as expected once
checked, cool (allowing a preferred time range, such as during the night,
would also be a nice option).
Pablo's idea is really top notch: a warning along the lines of "you're not
getting metadata and content results, fix it" would really cover all styles
of usage.

Now Luca, you didn't say a word about those who have already indexed file
names and content: when the default changes, what will happen?
Should I go to System Settings and reactivate metadata and content
indexing?

Thanks for taking care of us ;-)

--

Bruno Friedmann
 Ioda-Net Sàrl www.ioda-net.ch
 Bareos Partner, openSUSE Member, fsfe fellowship
 GPG KEY : D5C9B751C4653227
 irc: tigerfoot

Re: Proposed behavioral change for Baloo indexing

Cor Blom
In reply to this post by Luca Beltrame
On 08-04-17 at 15:15, Luca Beltrame wrote:

> The Baloo indexer shipped with Plasma does its job in two passes: first the file
> names are indexed (fast) then metadata and content are extracted (slow).
>
> This second pass can be problematic and it's highly user-dependent, because
> the various metadata extractors have varying degrees of efficiency. When there
> is a problem with metadata extraction, it's often the cause of the high CPU
> usage and disk I/O reported.
>
> One proposal we had in IRC was to change the behavior of the indexer (which is
> enabled by default): switch it to "basic indexing" which only indexes the
> names and not the metadata, as a solution for the above problem. There's a
> (config-only) option that does the job.
>
> Due to a bug it was not possible to change that configuration and it was never
> saved to disk. We also plan to add a checkbox to toggle the option on and off
> in the configuration dialog (System Settings > Search) for those who use
> metadata search.
>
> However it's a behavioral change: anyone who uses the default configuration
> will be affected by this. Hence this post to gather feedback. Should we do
> this, or not?
>
> NOTE: This is about the *default*. If you have disabled Baloo nothing will
> happen.
>

For me the metadata extraction never really worked without problems, so
I've disabled baloo. I would welcome having file indexing only and that
is also the only part I use and need. I did not know there was a config
option, so having a checkbox is an improvement. (Or I knew and forgot
again).

Changing the default would, in my opinion, depend on whether someone then
loses the metadata index. Does it need to be rebuilt after re-enabling
through the checkbox, and how much time will that indexing take? If it needs
to be recreated and that takes a lot of time (because someone has a lot of
files), I would advise against a change.

Thanks,

Cor

Re: Proposed behavioral change for Baloo indexing

Achim Gratz
In reply to this post by Luca Beltrame
Luca Beltrame writes:
> One proposal we had in IRC was to change the behavior of the indexer (which is
> enabled by default): switch it to "basic indexing" which only indexes the
> names and not the metadata, as a solution for the above problem. There's a
> (config-only) option that does the job.

This or any other indexer should be off by default.  Then, when a user
uses a search function that needs the index, ask about activating it.


Regards,
Achim.
--
+<[Q+ Matrix-12 WAVE#46+305 Neuron microQkb Andromeda XTk Blofeld]>+

Wavetables for the Terratec KOMPLEXER:
http://Synth.Stromeko.net/Downloads.html#KomplexerWaves

Re: Proposed behavioral change for Baloo indexing

Luca Beltrame
On 8 April 2017 18:36:52 CEST, Achim Gratz <[hidden email]> wrote:
>Luca Beltrame writes:
 
>This or any other indexer should be off by default.  Then, when a user
>uses a search function that needs the index, ask about activating it.

This is not what this discussion is about.

--
Luca Beltrame
GPG key ID: 6E1A4E79

Re: Proposed behavioral change for Baloo indexing

Luca Beltrame
In reply to this post by Bruno Friedmann-2
On Saturday, 8 April 2017 17:41:09 CEST, Bruno Friedmann wrote:

> Now Luca, you didn't say a word about those who have already indexed file
> names and content: when the default changes, what will happen?

From what I know, *existing* stuff should be preserved. Newer files would not
get their metadata indexed, of course.

> Should I go to System Settings and reactivate metadata and content
> indexing?

This would only affect new files: I just need to check whether any change in
the KCM would wipe the index and restart from scratch (I know this happens if
you change the folders to be indexed, because it's cheaper to start from
scratch than to look through the whole index to remove entries).

--
Luca Beltrame - KDE Forums team
GPG key ID: A29D259B
