rsync between two years

rsync between two years

jdd@dodin.org
Hello,

The web is filled with rsync examples, but I found none that fit my
needs, and the find and rsync options are really hard to manage.

I have a tree of various files I use frequently, that get larger each year.

I want to archive them by year or range of years (for example from the
start to 2010, from 2011 to 2012...). In the end, nothing must be omitted.

I have a lot of 32 GB SD cards, and the whole file shouldn't take more than
3 or 4 cards.

What is the best way to do this?

I would like to use the --delete option, because I may remove old files
on the source.

thanks
jdd
--
http://dodin.org

--
To unsubscribe, e-mail: [hidden email]
To contact the owner, e-mail: [hidden email]

Re: rsync between two years

Bernhard Voelker
On 01/11/2018 07:13 PM, [hidden email] wrote:

> Hello,
>
> The web is filled with rsync examples, but I found none that fit my
> needs an find and rsync options are really hard to manage.
>
> I have a tree of various files I use frequently, that get larger each year.
>
> I want to archive them by year or range of years (example from start to
> 2010, from 2011 to 2012... at the end nothing must be ommitted
>
> I have a lot of 32Gb sd cards and the whole file shouldn't be more than
> 3 or 4 cards.
>
> What is the best way to do this?

TBH this sounds more like a one-time moving task rather than a regular sync.
Of course, you can use rsync as part of the solution, but it would be
interesting to understand more what you want to do.

Do you have a random, nested directory structure like

  dir/a/file-from-2010
  dir/a/file-from-2012
  dir/a/x/file-from-2013
  dir/b/c/file-from-2014
  dir/file-from-2011
  ...

and want to archive the files based on their modification timestamp?

> I would like to use the --delete option, because I may remove old files
> on source

The --delete option does not delete on SRC side, but would remove files/directories
on the DST side if the file does not exist on SRC side.

Have a nice day,
Berny

Re: rsync between two years

Carlos E. R.-2
In reply to this post by jdd@dodin.org
On 2018-01-11 19:13, [hidden email] wrote:

> Hello,
>
> The web is filled with rsync examples, but I found none that fit my
> needs an find and rsync options are really hard to manage.
>
> I have a tree of various files I use frequently, that get larger each year.
>
> I want to archive them by year or range of years (example from start to
> 2010, from 2011 to 2012... at the end nothing must be ommitted
>
> I have a lot of 32Gb sd cards and the whole file shouldn't be more than
> 3 or 4 cards.
I don't know that rsync can split a file across several cards.


--
Cheers / Saludos,

                Carlos E. R.
                (from 42.2 x86_64 "Malachite" at Telcontar)


Re: rsync between two years

jdd@dodin.org
Le 11/01/2018 à 19:37, Carlos E. R. a écrit :

> I don't know that rsync can split a file across several cards.
>
>
not a file, but files grouped (I'm a bit short of English vocabulary :-()

jdd

--
http://dodin.org

Re: rsync between two years

jdd@dodin.org
In reply to this post by Bernhard Voelker
Le 11/01/2018 à 19:24, Bernhard Voelker a écrit :

> TBH this sounds more like a one-time moving task rather than a regular
> sync.

Only in part. I often notice that some files are duplicated, so I can
remove them without loss, and I try to keep the archives in the same state.

> Of course, you can use rsync as part of the solution, but it would be
> interesting to understand more what you want to do.

split the source by dates, to make browsing the archives easier

>
> Do you have a random, nested directory structure like
>
>   dir/a/file-from-2010
>   dir/a/file-from-2012
>   dir/a/x/file-from-2013
>   dir/b/c/file-from-2014
>   dir/file-from-2011

No, not there. I have a date-based nested structure for the real
"collections" (photos, videos), but not for the day-to-day files, which
are better served by another classification.

>> I would like to use the --delete option, because I may remove old files
>> on source
>
> The --delete option does not delete on SRC side, but would remove
> files/directories
> on the DST side if the file does not exist on SRC side.

Exactly. I often find duplicates and want to remove them from all archives
(I have several copies of the archives, some older than others).

I name the folder I want to archive "mes-documents" ("my documents" in
French), and I drop them where I need them.

I would like to use rsync to sort them by date while keeping the
directory structure.

Say I have a folder "taxes" somewhere, and I want to archive its contents
as taxes from 2000 to 2010, for example.

jdd


--
http://dodin.org

Re: rsync between two years

Carlos E. R.-2
In reply to this post by jdd@dodin.org
On 2018-01-11 19:42, [hidden email] wrote:
> Le 11/01/2018 à 19:37, Carlos E. R. a écrit :
>
>> I don't know that rsync can split a file across several cards.
>>
>>
> not a file, but files grouped (I'm a bit short of English vocabulary :-()

Even so, rsync cannot copy to several cards, distributing the files among
them. It does a single operation to a single destination.

--
Cheers / Saludos,

                Carlos E. R.
                (from 42.2 x86_64 "Malachite" at Telcontar)


Re: rsync between two years

Carlos E. R.-2
In reply to this post by jdd@dodin.org
On 2018-01-11 19:51, [hidden email] wrote:
> Le 11/01/2018 à 19:24, Bernhard Voelker a écrit :


>>> I would like to use the --delete option, because I may remove old files
>>> on source
>>
>> The --delete option does not delete on SRC side, but would remove
>> files/directories
>> on the DST side if the file does not exist on SRC side.
>
> exactly. I often find duplicates an want to remove the from all archives
> (I have several copies of the archives, some older than others)

I don't think you understand the --delete action.

The intention is that you do a copy, say from a directory to a memory
card. Some days later you repeat the same operation: same directory and
same card. Well, the --delete option will delete on the backup the files
that are no longer in the original.


Also, if you make a mistake and insert the wrong memory card (a backup of
another directory), it will delete everything on the card.

For example, you back up "photos" to card A, and "taxes" to card B. If
another day you run rsync with --delete from "photos" to card B, then the
taxes on card B are deleted and you lose that backup.

So be very careful with --delete. Happened to me.
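
One way to guard against the wrong-card mistake described above is to label each card and refuse to run when the label does not match. The following is only a sketch under stated assumptions: the `sync_with_delete` function and the `.archive-label` marker file are hypothetical names invented for this illustration, not rsync features. The rsync options it uses (`-a`, `--delete`, `--exclude`) are standard ones.

```shell
#!/bin/sh
# sync_with_delete SRC DST LABEL
# Hypothetical wrapper: runs "rsync --delete" only if DST contains a marker
# file (.archive-label, an assumption of this sketch) whose content is LABEL.
sync_with_delete() {
    src=$1; dst=$2; label=$3
    marker="$dst/.archive-label"

    # Refuse to touch a destination that is unlabelled or labelled differently.
    if [ ! -f "$marker" ] || [ "$(cat "$marker")" != "$label" ]; then
        echo "refusing: $dst is not labelled '$label'" >&2
        return 1
    fi

    # The marker is excluded from the transfer, so --delete leaves it in place.
    rsync -a --delete --exclude=/.archive-label "$src/" "$dst/"
}

# Example (paths are placeholders):
#   echo photos > /mnt/cardA/.archive-label      # once, when preparing the card
#   sync_with_delete ~/photos /mnt/cardA photos
```

Running rsync with `-n` (`--dry-run`) first is another inexpensive check, since it lists what would be deleted without changing anything.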

>
> the folder I want to archive I name "mes-documents" (my-docs in french)
> and I drop them where I need them.
>
> I would like to use rsync to sort them by date while keeping the
> directory structure.
>
> Say, I have somewhere a folder "taxes" and I want to archive them as
> taxes from 2000 to 2010 for example

I don't see how to sort by date with rsync, but perhaps it is possible.

Me, I would perhaps create a secondary tree of hardlinks of the files
sorted by date, then backup that. Hard links do not use extra space.

--
Cheers / Saludos,

                Carlos E. R.
                (from 42.2 x86_64 "Malachite" at Telcontar)


Re: rsync between two years

Bernhard Voelker
In reply to this post by jdd@dodin.org
On 01/11/2018 07:51 PM, [hidden email] wrote:

>> Do you have a random, nested directory structure like
>>
>>    dir/a/file-from-2010
>>    dir/a/file-from-2012
>>    dir/a/x/file-from-2013
>>    dir/b/c/file-from-2014
>>    dir/file-from-2011
>
> no, not there. I have a nested by date structure for the real
> "collections" (photos, videos), but not for the day to day files that
> are more fitted by other classification

you missed the 2nd half of the question:
do you want to archive based on the timestamps of the files, or is the
date part of the folders you're putting the files in?

In the former case (timestamps), the solution may be a combination of
'find' and 'rsync' (or 'mv'/'cp', depending); e.g. the following finds
the files with a modification time in the year 2012:

   $ find . -newermt 2012-01-01 ! -newermt 2013-01-01 -ls

So you could use that to copy all those files into a 2012 folder
(untested!):

   $ find . -type f -newermt 2012-01-01 ! -newermt 2013-01-01 -print0 \
       | rsync -HAXaxi --from0 --files-from=- . /path2/archive-2012/.

If you want to remove the files from the source directory, then I'd
redirect the 'find' output to a file, and use it once for rsync and
once for 'xargs -0 rm' as input.
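
For a range of years rather than a single year, the same two -newermt tests can be wrapped in a small helper. This is a sketch only: `list_year_range` is a hypothetical name, the paths in the usage comment are placeholders, and GNU find is assumed. Because -newermt is a strict comparison, every file is either newer than a given boundary or not, so consecutive ranges that share a boundary neither overlap nor leave a gap.

```shell
#!/bin/sh
# list_year_range SRC FIRST LAST
# Hypothetical helper: prints, NUL-separated and relative to SRC, the regular
# files whose modification time falls in the years FIRST..LAST (inclusive).
# Pass "" as FIRST to select everything from the beginning up to LAST.
# Requires GNU find for -newermt.
list_year_range() {
    src=$1; first=$2; last=$3
    upper="$((last + 1))-01-01"   # first day after the range

    if [ -n "$first" ]; then
        (cd "$src" && find . -type f -newermt "$first-01-01" ! -newermt "$upper" -print0)
    else
        (cd "$src" && find . -type f ! -newermt "$upper" -print0)
    fi
}

# Example (paths are placeholders):
#   list_year_range ~/mes-documents 2011 2012 |
#       rsync -a --from0 --files-from=- ~/mes-documents/ /mnt/card2/
```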

Have a nice day,
Berny



Re: rsync between two years

jdd@dodin.org
In reply to this post by Carlos E. R.-2
Le 11/01/2018 à 23:41, Carlos E. R. a écrit :

> Even so, rsync can not copy to several cards distributing the files. It
> does a single operation to a single destination.
>
it is possible to copy only files within some criteria, using find, for
example:

https://unix.stackexchange.com/questions/87018/find-and-rsync

but I'm short of syntax details for using years as criteria

thanks
jdd

--
http://dodin.org

Re: rsync between two years

jdd@dodin.org
In reply to this post by Carlos E. R.-2
Le 11/01/2018 à 23:52, Carlos E. R. a écrit :

> I don't think you understand the --delete action.

yes I do, I use it all the time

> Me, I would perhaps create a secondary tree of hardlinks of the files
> sorted by date, then backup that. Hard links do not use extra space.
>
hard link on other device??

jdd

--
http://dodin.org

Re: rsync between two years

Aaron Digulla
In reply to this post by jdd@dodin.org
Am Donnerstag, Januar 11, 2018 19:13 CET, "[hidden email]" <[hidden email]> schrieb:

> Hello,
>
> The web is filled with rsync examples, but I found none that fit my
> needs an find and rsync options are really hard to manage.
>
> I have a tree of various files I use frequently, that get larger each year.
>
> I want to archive them by year or range of years (example from start to
> 2010, from 2011 to 2012... at the end nothing must be ommitted

rsync can't do that. It has no options to select files by date alone.

Try to create a second folder structure using find(1) and the option "-newermt" (see https://unix.stackexchange.com/questions/73268/how-to-move-the-files-based-on-year for an example how to use a date range).

Use hardlinks (ln(1) without -s) to "create" the files in the per-year directory structure, for example:

    -exec ln "{}" "/path/to/backup/2012/{}" \;
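
The -exec fragment above assumes that the target sub-directories already exist. A fuller sketch, with the hypothetical function name `link_year_tree` and assuming GNU find, creates them on the way. DEST must be on the same file system as SRC (hard links cannot cross devices) and should lie outside SRC so that find does not walk into it.

```shell
#!/bin/sh
# link_year_tree SRC YEAR DEST
# Hypothetical helper: recreates under DEST the directory structure of SRC,
# containing hard links to the files last modified during YEAR.
link_year_tree() {
    src=$1; year=$2; dest=$3
    next=$((year + 1))

    mkdir -p "$dest" || return 1
    dest_abs=$(cd "$dest" && pwd)   # absolute path, still valid after "cd $src"

    (
        cd "$src" &&
        find . -type f -newermt "$year-01-01" ! -newermt "$next-01-01" \
            -exec sh -c '
                dest=$1; shift
                for f do
                    # Create the parent directory, then link the file into it.
                    mkdir -p "$dest/$(dirname "$f")" &&
                    ln -f "$f" "$dest/$f"
                done
            ' sh "$dest_abs" {} +
    )
}

# Example (paths are placeholders):
#   link_year_tree ~/mes-documents 2012 ~/by-year/2012
```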

Afterwards, you can use "find . -type f | sort > files.lst" on both folder structures and compare the two outputs to make sure you didn't miss anything. If the modification dates or permissions differ, use -printf to include only the name and file size.
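
The comparison could be sketched as follows, assuming GNU find for -printf. `compare_trees` is a hypothetical name; it lists relative path and size for each tree and prints the differences, if any.

```shell
#!/bin/sh
# compare_trees DIR_A DIR_B
# Hypothetical helper: compares the regular files of two trees by relative
# path and size. Prints the differences and returns non-zero if there are any.
compare_trees() {
    list_a=$(mktemp) && list_b=$(mktemp) || return 2

    # %P is the path relative to the starting point, %s the size in bytes.
    (cd "$1" && find . -type f -printf '%P\t%s\n' | LC_ALL=C sort) > "$list_a"
    (cd "$2" && find . -type f -printf '%P\t%s\n' | LC_ALL=C sort) > "$list_b"

    diff "$list_a" "$list_b"
    status=$?
    rm -f "$list_a" "$list_b"
    return "$status"
}
```

File names containing newlines would confuse this line-based listing; for ordinary document trees that is rarely a concern.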

When you're sure, you can back up each year. Afterwards, you can take the find(1) commands which you used to create the links and replace -exec ... with -delete to get rid of the saved files.

The folders with the hardlinks, you can simply delete.

Regards,


--
Aaron "Optimizer" Digulla a.k.a. Philmann Dark
"It's not the universe that's limited, it's our imagination.
Follow me and I'll show you something beyond the limits."
http://blog.pdark.de/

Re: rsync between two years

jdd@dodin.org
In reply to this post by Bernhard Voelker
Le 12/01/2018 à 00:46, Bernhard Voelker a écrit :

> do you want to archive based on the timestamps of the files,

this. I want to keep the original directory tree, but only with the
files of the given dates

> So you could use that to move all those files into a 2012 folder
> (untested!):
>
>    $ find . -type f -newermt 2012-01-01 ! -newermt 2013-01-01 -print0 \
>        | rsync -HAXaxi --from0 --files-from=- . /path2/archive-2012/.
>

It's nearly what I expected, thanks.

I use the (mounted) path of the SD card as destination, run it as root
(to preserve permissions), and can omit the first -newermt for a simple
"from date". Use of ! on the second -newermt is very smart.

However, I don't fully understand how it works. In particular, I guess find
sends the file names one at a time, but does rsync take them one by one,
copying them one at a time, or as a list, waiting for the list to be
complete before it begins writing? In other words, will the --delete option work?

I'm also surprised to see the "dot". I was thinking of the files as the
source, and if I understand correctly, the file list is only a filter on
the "." source, which still needs to be given.

The "-i" output is not that easy to understand either.

anyway, I wouldn't have got that command line without help, thanks a lot!

jdd
--
http://dodin.org

Re: rsync between two years

Carlos E. R.-2
In reply to this post by jdd@dodin.org
On Friday, 2018-01-12 at 09:02 +0100, [hidden email] wrote:

> Le 11/01/2018 à 23:52, Carlos E. R. a écrit :
>
>>  I don't think you understand the --delete action.
>
> yes I do, I use it all the time
>
>>  Me, I would perhaps create a secondary tree of hardlinks of the files
>>  sorted by date, then backup that. Hard links do not use extra space.
>>
> hard link on other device??

No, on the source device. You create a secondary tree built of hardlinks
to the original tree, in the order that you want to do the backup. Then do
the backup of that one.

I use this when I want to have two different sort methods for the same bunch
of files.

--
Cheers,
        Carlos E. R.
        (from openSUSE 42.2 x86_64 "Malachite" at Telcontar)

Re: rsync between two years

Carlos E. R.-2
In reply to this post by jdd@dodin.org
On Friday, 2018-01-12 at 08:59 +0100, [hidden email] wrote:

> Le 11/01/2018 à 23:41, Carlos E. R. a écrit :
>
>>  Even so, rsync can not copy to several cards distributing the files. It
>>  does a single operation to a single destination.
>>
> it is possible to copy only files within some criteria, using find, for
> example:
>
> https://unix.stackexchange.com/questions/87018/find-and-rsync
>
> but I'm short of syntax details for using years as criteria
I'd have to think about it. The idea is to run a single rsync command for
a large bunch of files, or it becomes inefficient.

--
Cheers,
        Carlos E. R.
        (from openSUSE 42.2 x86_64 "Malachite" at Telcontar)

Re: rsync between two years

Bernhard Voelker
In reply to this post by jdd@dodin.org
On 01/12/2018 12:16 PM, [hidden email] wrote:

> Le 12/01/2018 à 00:46, Bernhard Voelker a écrit :
>> So you could use that to move all those files into a 2012 folder
>> (untested!):
>>
>>    $ find . -type f -newermt 2012-01-01 ! -newermt 2013-01-01 -print0 \
>>        | rsync -HAXaxi --from0 --files-from=- . /path2/archive-2012/.
>>
>
> It's nearly what I expected, thanks.
>
> I use the (mounted) path of the sd card as destination, works as root (to preserve permissions) and can omit the first -newer for a simple "from date". Use of ! on the second newer is very smart.
>
> however, I don't fully understand how it works.

you can set up a test directory hierarchy:

  $ mkdir -p src/a/b src/c
  $ touch -d 2012-06-06 src/file-from-2012
  $ touch -d 2013-06-06 src/a/file-from-2013
  $ touch -d 2014-06-06 src/a/b/file-from-2014

Now you have:

  $ find -type f -exec ls -log '{}' +
  -rw-r--r-- 1 0 Jun  6  2014 ./src/a/b/file-from-2014
  -rw-r--r-- 1 0 Jun  6  2013 ./src/a/file-from-2013
  -rw-r--r-- 1 0 Jun  6  2012 ./src/file-from-2012

Selecting with only -newermt for files newer than Jan 1st of 2013
gives also the file from 2014:

  $ find -type f -newermt 2013-01-01 -exec ls -log '{}' +
  -rw-r--r-- 1 0 Jun  6  2014 ./src/a/b/file-from-2014
  -rw-r--r-- 1 0 Jun  6  2013 ./src/a/file-from-2013

Adding the "! -newermt" (spoken "not newer than") will find only
the files from 2013:

  $ find -type f -newermt 2013-01-01 ! -newermt 2014-01-01 -exec ls -log '{}' +
  -rw-r--r-- 1 0 Jun  6  2013 ./src/a/file-from-2013

> Specially I guess find send the file names one at a time, but do rsync take them one by one, copying them one at a time, or as a list, waiting to the
> list to be complete to begin writing.

Yes, find passes the names of the files one by one to rsync, which starts
the sync as soon as names are available on the pipe.

> In other words, will the option --delete works?

NO, --delete is operating on the DESTINATION.  As far as I understood you
would like to remove all the archived files from the SOURCE, right?

As mentioned in my previous mail, I'd save the output of find in
a file, and then pass that file
a) to rsync for archiving, and then
b) to "xargs -0 rm" to delete them in the SOURCE directory.

1. Get file names:
  $ find . -type f -newermt 2012-01-01 ! -newermt 2013-01-01 -print0 \
      > list

2. Archive the files:

  $ rsync -HAXaxi --from0 --files-from=- . /path2/archive-2012/. \
      < list

3. Remove the files from the SOURCE:

  $ xargs -0 rm -v < list

> I'm also surprised to see the "dot", I was thinking of the files as source and if I get it well, file list is only a test on the "." source that need to be set.

Well, I didn't find a proper example with the --files-from option
in the man page, so using "." was just guesswork. ;-)

> the "-i" output is not that easy to understand neither

you could change that to '-v' if you are more familiar with it.

> anyway, I wouldn't have got that command line without help, thanks a lot!

no worries!

Have a nice day,
Berny

Re: rsync between two years

jdd@dodin.org
Le 12/01/2018 à 17:36, Bernhard Voelker a écrit :

> Adding the "! -newermt" (spoken "not newer than") will find only
> the files from 2013,
Yes, I was only wondering whether removing the "!" gives a second set with
everything that is not in the first, so that nothing in between is forgotten.

>> In other words, will the option --delete works?
>
> NO, --delete is operating on the DESTINATION.  As far as I understood you
> would like to remove all the archived files from the SOURCE, right?

Wrong. If I happen to remove a file from the source, I want this to be
also done on the copy.

If I first save the file list in a file and give it as argument to
--files-from, will --delete work? Otherwise I need another line to first
remove the extra files.

>> I'm also surprised to see the "dot", I was thinking of the files as source and if I get it well, file list is only a test on the "." source that need to be set.
>
> Well, I didn't find a proper example with the --files-from option
> in the man page, so using "." was just guesswork. ;-)

good guess :-)

>
>> the "-i" output is not that easy to understand neither
>
> you could change that to '-v' if you are more familiar with it.

I tried, but it's less informative

thanks
jdd

--
http://dodin.org

Re: rsync between two years

jdd@dodin.org
Le 12/01/2018 à 18:43, [hidden email] a écrit :

> If I first copy the file list in a file and give it as argument for
> --files-from, will --delete works?


Tested: no, it doesn't work. rsync takes the files in a from-file one at a
time.

At present, the only thing I can think of is to blank the target completely,
and then using rsync is pointless :-( (maybe it's a better cp).

or is there a way to:

* compare the file list on source and target
* remove files on target that were previously removed on source

the reason is that I have from time to time to clean this folder tree
and I don't want to have to clean it again if I have to recover from the
archives
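
The two steps listed above could be sketched without rsync, under the assumption that an archive keeps the same relative paths as the source. `list_removed_from_source` is a hypothetical name; it prints every file of the archive that no longer exists in the source, NUL-separated, so the list can be reviewed before anything is deleted.

```shell
#!/bin/sh
# list_removed_from_source SRC DST
# Hypothetical helper: prints, NUL-separated and relative to DST, the regular
# files that exist under DST but no longer under SRC at the same path.
list_removed_from_source() {
    src=$1; dst=$2
    src_abs=$(cd "$src" && pwd)   # absolute path, still valid after "cd $dst"

    (
        cd "$dst" &&
        find . -type f -exec sh -c '
            src=$1; shift
            for f do
                # Report the file only if the source has nothing at this path.
                [ -e "$src/$f" ] || printf "%s\000" "$f"
            done
        ' sh "$src_abs" {} +
    )
}

# Example (paths are placeholders; -r is a GNU xargs option that skips rm
# when the list is empty):
#   list_removed_from_source ~/mes-documents /mnt/card2 | tr '\0' '\n'   # review
#   list_removed_from_source ~/mes-documents /mnt/card2 |
#       (cd /mnt/card2 && xargs -0 -r rm -v --)
```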

thanks
jdd



--
http://dodin.org

Re: rsync between two years

Bernhard Voelker
In reply to this post by jdd@dodin.org
On 01/12/2018 06:43 PM, [hidden email] wrote:

> Le 12/01/2018 à 17:36, Bernhard Voelker a écrit :
>>> In other words, will the option --delete works?
>>
>> NO, --delete is operating on the DESTINATION.  As far as I understood you
>> would like to remove all the archived files from the SOURCE, right?
>
> wrong. If I happen to remove a file from the source, I want this to be
> also done on the copy
>
> If I first copy the file list in a file and give it as argument for
> --files-from, will --delete works? else I need an other line to first
> remove the extra files

sorry, I'm lost.
The --delete option clearly operates on the destination:

   $ rsync --help | grep -F ' --delete '
      --delete                delete extraneous files from destination dirs

For further discussion, please provide simple example directory hierarchies
for SRC and DST (like I did), showing how they should look before and after
a sync/archive/copy/delete (or whatever you want to achieve) so we can
understand what you want/need.

Have a nice day,
Berny

Re: rsync between two years

Bernhard Voelker
In reply to this post by jdd@dodin.org


On 01/12/2018 07:17 PM, [hidden email] wrote:
> Le 12/01/2018 à 18:43, [hidden email] a écrit :
>
>> If I first copy the file list in a file and give it as argument for
>> --files-from, will --delete works?
>
>
> tested, no it don't works. rsync takes the files in a from-file one at a
> time.

What else would you expect?  It gets a list of files, and processes them
one after another.  What's wrong with that?  Sorry, I'm lost again.

> presently, the only thing I find is to blank completely the target and
> then, using rsync is pointless :-( (may be it's a better cp)

Why erase from the target?

> or is there a way to:
>
> * compare the file list on source and target

rsync *does* compare the files on source and target; well, for performance
reasons, it compares file size and timestamps only, but you could use
the -c option to compare the full content.

> * remove files on target that where previously removed on source

ah, that's what the --delete option is for: if file A exists in the
destination but no longer in source, then it deletes A in the destination.

   $ mkdir src dst

   $ touch dst/myfile

   $ rsync -avx --delete src/. dst/.
   sending incremental file list
   deleting myfile
   ...

Have a nice day,
Berny


Re: rsync between two years

jdd@dodin.org
In reply to this post by Bernhard Voelker
Le 13/01/2018 à 00:03, Bernhard Voelker a écrit :

> sorry, I'm lost.
> The --delete option clearly operates on the destination:

Of course.

If, for example, I delete cache files from the source, I want them to be also
deleted from the destination where they were previously copied.

> For further discussion, please provide simple example directory hierarchies
> for SRC and DST (like I did),

It can be anything, but the same on source and destination, restricted to the dates.

It's because it's random that I need all this.

thanks
jdd
(NB: I will be offline for the next 72h)


--
http://dodin.org
