René's URL Explorer Experiment


Title: tagpix - User Guide

Mail addresses
lutz@learning-python.com


Domain: learning-python.com


Links:

https://learning-python.com/tagpix/screenshots/index.html
cautionhttps://learning-python.com/tagpix/UserGuide.html#Usage Caution
main scripthttps://learning-python.com/tagpix/tagpix.py
folderhttps://learning-python.com/tagpix/
web pagehttps://learning-python.com/tagpix.html
Overviewhttps://learning-python.com/tagpix/UserGuide.html#Overview
Usage Detailshttps://learning-python.com/tagpix/UserGuide.html#Usage Details
Installs and Platformshttps://learning-python.com/tagpix/UserGuide.html#Installs and Platforms
Input Promptshttps://learning-python.com/tagpix/UserGuide.html#Input Prompts
Results Reporthttps://learning-python.com/tagpix/UserGuide.html#Results Report
Results Treehttps://learning-python.com/tagpix/UserGuide.html#Results Tree
Resolving Skipshttps://learning-python.com/tagpix/UserGuide.html#Resolving Skips
Usage Modeshttps://learning-python.com/tagpix/UserGuide.html#Usage Modes
Other Usage Noteshttps://learning-python.com/tagpix/UserGuide.html#Other Usage Notes
Recent Changeshttps://learning-python.com/tagpix/UserGuide.html#Recent Changes
Version 2.3: Silence Pillow DOS Warninghttps://learning-python.com/tagpix/UserGuide.html#Version 2.3
Version 2.2: Use and Drop Android Dateshttps://learning-python.com/tagpix/UserGuide.html#Version 2.2
Version 2.1: Multiple Enhancementshttps://learning-python.com/tagpix/UserGuide.html#Version 2.1
Version 2.0: Numerous Upgradeshttps://learning-python.com/tagpix/UserGuide.html#Version 2.0
Usage Cautionhttps://learning-python.com/tagpix/UserGuide.html#Usage Caution
folderhttps://learning-python.com/tagpix/screenshots/results-flat.png
subfoldershttps://learning-python.com/tagpix/screenshots/results-groupedall.png
tagshttps://en.wikipedia.org/wiki/Exif
filenameshttps://learning-python.com/tagpix/UserGuide.html#Version 2.2
version 2.1https://learning-python.com/tagpix/UserGuide.html#21duplicates
typehttps://en.wikipedia.org/wiki/Media_type
extensionhttps://en.wikipedia.org/wiki/Filename_extension
subfoldershttps://learning-python.com/tagpix/screenshots/results-2.2-android.png
rerunhttps://learning-python.com/tagpix/UserGuide.html#reruns
Androidhttps://learning-python.com/tagpix/UserGuide.html#Version 2.2
list-onlyhttps://learning-python.com/tagpix/UserGuide.html#Input Prompts
reporthttps://learning-python.com/tagpix/UserGuide.html#Results Report
learning-python.com/tagpix.htmlhttps://learning-python.com/tagpix.html
www.python.org/downloads/https://www.python.org/downloads/
pypi.python.org/pypi/Pillowhttps://pypi.python.org/pypi/Pillow
this pagehttps://learning-python.com/tagpix/README-macapp.html
exif.pyhttps://duckduckgo.com/?q=%22exif.py
domainhttps://duckduckgo.com/?q=python+exif
shothttps://learning-python.com/tagpix/screenshots/tx-run-on-android-1.jpg
loghttps://learning-python.com/tagpix/examples/tx-run-on-android-log.txt
Termuxhttps://termux.com/
this commandhttps://pillow.readthedocs.io/en/stable/installation.html#building-on-android
Pydroid 3https://play.google.com/store/apps/details?id=ru.iiec.pydroid3
this dochttps://learning-python.com/mergeall-android-scripts/_README.html#toc9
Pythonistahttps://apps.apple.com/us/app/pythonista-3/id1085978097
tagpix.pyhttps://learning-python.com/tagpix/tagpix.py
PyEdithttp://learning-python.com/pyedit.html
herehttps://learning-python.com/tagpix/screenshots/tagpix-run-in-pyedit.png
aheadhttps://learning-python.com/tagpix/UserGuide.html#Results Report
copy). You can either enter an explicit folder, or press Enter to accept the default: To use an explicit folder, enter the pathname of the root folder containing all the photo subfolders you wish to combine. For example, you might give the root folder just above those where you store photos from your camera cards, copies, or imports. If you prefer to use the default, it is the SOURCE folder in the current working directory (e.g., in the script's own directory, if run from the same). Move or copy all your camera folders and images to there before running this script. Whether the source folder is explicit or default, all its content and subfolders will be scanned to collect all items in the entire source-folder tree. Per transfer-modes coverage ahead, the source folder will either be emptied or left intact after a tagpix run, according to your configurations. For #3 (the destination): This is where images are moved (or copied) to—the folder containing the result's MERGED folder (described in more detail ahead). You can either enter an explicit folder, or press Enter to accept the default: To use an explicit folder, enter the pathname of the folder to which you wish tagpix to transfer your merged source items. Result folders will be created there automatically as needed. If you prefer to use the default, it is the current working directory (e.g., results appear in MERGED within the script's own directory, if run from the same). Move or copy the result folders from there after running this script. Whether the destination folder is explicit or default, its MERGED subfolder will hold all your combined source-tree items after the tagpix run. Per usage-modes coverage ahead, if you enter a prior run's folder at this prompt, it will be extended; if you enter a new folder, it will be generated. https://learning-python.com/tagpix/UserGuide.html#21copymodes
aheadhttps://learning-python.com/tagpix/UserGuide.html#21copymodes
aheadhttps://learning-python.com/tagpix/UserGuide.html#Results Tree
aheadhttps://learning-python.com/tagpix/UserGuide.html#Usage Modes
aheadhttps://learning-python.com/tagpix/UserGuide.html#Results Tree
herehttps://learning-python.com/tagpix/UserGuide.html#Usage Modes
herehttps://learning-python.com/tagpix/UserGuide.html#Resolving Skips
2.1https://learning-python.com/tagpix/UserGuide.html#21deleteverify
exampleshttps://learning-python.com/tagpix/examples/
this examplehttps://learning-python.com/tagpix/examples/large-13k-photo-extending.txt
streamhttps://en.wikipedia.org/wiki/Standard_streams
herehttps://learning-python.com/tagpix/examples/AUTOMATED-INPUTS/tagpix-listonly.sh
herehttps://learning-python.com/tagpix/examples/AUTOMATED-INPUTS/tagpix-merge.sh
PyEdithttp://learning-python.com/pyedit.html
promptshttps://learning-python.com/tagpix/UserGuide.html#Input Prompts
herehttps://en.wikipedia.org/wiki/Standard_streams
earlierhttps://learning-python.com/tagpix/UserGuide.html#automatedinputs
aheadhttps://learning-python.com/tagpix/UserGuide.html#Resolving Skips
folderhttps://learning-python.com/tagpix/examples/
this filehttps://learning-python.com/tagpix/examples/large-13k-photo-run-report.txt
prompthttps://learning-python.com/tagpix/UserGuide.html#Input Prompts
prompthttps://learning-python.com/tagpix/UserGuide.html#Input Prompts
earlierhttps://learning-python.com/tagpix/UserGuide.html#Overview
copied) to OTHERS. After a tagpix run, you may wish to manually remove items from OTHERS that reflect camera-specific cruft. For example, some cameras create .THM or .CTG files which are irrelevant to your content in PHOTOS and MOVIES. tagpix does not omit these automatically, because it prefers to err on the side of caution (only well-known .* hidden files and user-selected subfolders are skipped, per the next section). Be sure to delete only cruft: the OTHERS result folder may contain non-camera images like PNGs and GIFs too. For a more graphical look at results trees, see the examples folder's screenshots of both flat and group-by-year modes. Resolving Skips Following a run, you should check the report's final Missed section to see if any files were skipped due to: Normally skipped name patterns These are not errors, and include both Unix .* hidden items, and items in subfolders matching a configurable skip pattern. They are also noted in Skipping message lines at the top of the report. Duplicate content These are not errors, but are skipped by design as described in Overview above. They are also noted in ***Duplicate message lines earlier in the report. File-transfer errors These are genuine errors, but do not stop the program: other files are processed after the error is encountered. They are also noted in ***Error message lines earlier in the report. All items skipped are left intact in the source tree, and listed in the Missed section. If the Missed line shows 0 skips, or if you are okay with the items skipped, delete the contents of your source folder after the run if desired; if there were no skips, it's just empty directories (but see also the mode variations note ahead). If the Missed line's skips is not 0 and valid items were skipped due to errors, resolve their issues (e.g., fix locks or permissions, or use a shorter destination path on Windows) and rerun tagpix to transfer them. 
For the rerun, use the same source and destination folders as the original run, and do not delete the prior run's results (at prompts #2, #3, and #6).

Mode variations: most of the above pertains to file-move and copy-and-delete transfer modes only. When tagpix is run in copy-only mode, added in version 2.1, it does not produce a Missed line or section in the results report, because no files are removed from the source tree. Instead, the report in this mode ends with the message "Nothing was removed from the source tree". To analyze skips in copy-only mode, search for messages earlier in the report, as described for the three skip categories listed above.

Usage Modes

Depending on the replies you provide to input prompts, you can use this script to either extend an existing archive or make one anew, and can do both with the aid of another program:

A. To extend an archive (e.g., for viewing, or full optical-disc burn), for prompt #3 give the same destination-folder path as a prior run (i.e., the path to the folder containing a prior run's MERGED result folder), and answer no to #6 prompts; new source items will be moved (or copied) to the prior run's folders.

B. To make a new archive (e.g., for an initial or incremental optical-disc burn), for prompt #3 give a new destination-folder path, perhaps with the run date in its name; source items will be moved (or copied) to the new archive's folders.

C. To add new items to both an incremental archive for burning and an existing archive for viewing, use the preceding mode B first, and then merge the new archive's contents into an existing archive with another tool (a GUI cut/paste or drag-and-drop will generally suffice).

For an example of usage mode A, see the logs here and here. For an example of mode B, see the log here. For additional usage-mode examples, see the full examples folder. For alternative file transfer modes, see version 2.1 release notes.
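As a rough aid for the report checks described under Resolving Skips, the sketch below tallies skip-related lines in a saved report. It is not part of tagpix: the helper name tally_skips is hypothetical, and matching the "Skipping", "***Duplicate", and "***Error" markers at the start of a line is an assumption about the report's layout rather than a documented format. The sample lines are invented for illustration.

```python
from collections import Counter

# Marker strings named in this guide; matching them at the start of a
# line is an assumption about the report layout, not a documented format.
MARKERS = {
    "Skipping": "pattern skips",
    "***Duplicate": "duplicate content",
    "***Error": "transfer errors",
}

def tally_skips(report_lines):
    """Count report lines per skip category (hypothetical helper)."""
    counts = Counter()
    for line in report_lines:
        text = line.lstrip()
        for marker, label in MARKERS.items():
            if text.startswith(marker):
                counts[label] += 1
                break
    return counts

if __name__ == "__main__":
    # Invented sample lines; a real report's wording will differ.
    sample = [
        "Skipping: ./SOURCE/.DS_Store",
        "***Duplicate skipped: ./SOURCE/a.jpg",
        "***Error moving: ./SOURCE/b.jpg",
        "moved: ./SOURCE/c.jpg",
    ]
    for label, count in tally_skips(sample).items():
        print(label, count)
```

To use it on a real run, pass the lines of a report saved to a text file (for example, list(open(path))). In copy-only mode, where no Missed section is printed, a tally like this is one way to review skips.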
Other Usage Notes

This section collects smaller usage notes and tips. Some summarize earlier coverage.

Result path lengths
The combination of folder names and date-of-origin prefixes created by tagpix can be 31 characters long, not counting photo base names (e.g., MERGED/PHOTOS/2018/2010-12-03__). If merged results exceed pathname limits on your platform, try using a shorter destination path (i.e., a folder higher on your drive).

Preventing changes
tagpix makes no changes if the source folder does not exist; the user cancels the run verification or requests a list-only run (via prompts #1 or #5); or the script is killed while waiting for any input (e.g., control+C in a console, or a kill request in an IDE). As of version 2.1, you can also prevent source-tree changes by enabling copy-only file transfer mode.

Reruns on prior results
It's safe to rerun tagpix on items and folders it created in the past, because it automatically detects and discards any extra date prefixes (the YYYY-MM-DD parts) added to filenames by prior tagpix runs. It also ensures the new and prior dates match, to avoid stripping any user-added text in the process.

Duplicates are handled automatically
Per the overview above, it's safe to run tagpix to combine trees with duplicate item copies: they are automatically skipped (for duplicate content) or renamed (for duplicate filenames).

Redundant Android dates are dropped automatically
Per the release notes ahead, tagpix discards dates added to filenames by Android cameras that are redundant with dates added by tagpix itself. This keeps your image filenames shorter and is generally what you'll want. For more control, you can also disable and customize this feature with configurations.

Rerunning after errors
It's safe to rerun the script if it exits early, or skips items due to file-transfer errors described earlier.
The next run will simply rename and transfer all the items left in the source folder (but be careful not to delete the prior run's results when asked and verified by prompt #6!). Source-folder content tagpix always skips both hidden files whose names begin with a . (e.g., Mac OS .DS_Store files), as well as all items in subfolders whose names match the user-configurable skips pattern added in version 2.1 (described ahead). All other items in the source tree are transferred to the destination's folders. See also Resolving Skips above. Choosing folders to merge As a rule of thumb, files that are not movies or photos with date-taken tags may be better left out of the tree that tagpix will merge. This includes both scanned photos, whose dates will all reflect scan date instead of event date, and images such as PNGs and GIFs that have no date-taken information. You can merge these too, but scans will be renamed with their scan date (which probably won't be useful alongside photos' date-taken), and images of untagged types will wind up in the OTHERS folder instead of PHOTOS (which merits a separate note, up next). Moving OTHERS images to PHOTOS Speaking of the OTHERS results folder: by design, tagpix recognizes photos as images with MIME types that imply Exif tags (as described earlier), and always moves other image types to the OTHERS folder, not PHOTOS. This means that PHOTOS gets all JPEGs and TIFFs (Exif tags or not), but non-photo image types like PNGs, GIFs, and BMPs are routed to OTHERS. If you'd rather see the latter bunch in PHOTOS too, simply move them across manually after a tagpix run; because items in OTHERS are also labeled with dates, they'll work well in PHOTOS alongside your camera JPEGs. Request for comments: if you think that combining all image types as described here should be automated with a new tagpix option, please send feedback via the Input link in this guide's bottom toolbar. 
To date, no user (including tagpix's creator) has asserted a need for this, and software growth sans use case is a Generally Bad Thing. Dates, not times Time is not included in filename prefixes, because it would make names longer, and camera-added sequence numbers will normally suffice to identify and order photos taken on the same day. Dates are more crucial, as different cameras may use the same sequence numbers. Note that Android cameras may already have a time in their filenames, which tagpix retains, and makes names as unique as sequence numbers. Modification dates, not creation dates When picking a date-of-origin prefix, tagpix uses a file's modification date (via Python's os.path.getmtime()) as a last resort, after trying photo Exif tags and then Android filename date (per this). Modification date reflects either the file's creation date (if it has not been edited), or its latest modification (if it has); for unretouched photos, this is normally the true date of origin. It's worth noting that tagpix by design does not try to use a file's creation date—a datum dependent on both operating system and filesystem. Specifically, file creation date is generally available on Windows only (not on Unix, where it is weakly supported on Mac OS and no better than modification time on Linux), and even where available can sometimes be irrelevant when content changes. For background, try this discussion thread, this filesystems comparison, and Python's os.path.getctime() and os.stat(). Because tagpix works in the woefully unstandardized filesystems realm, it must use modification dates in the name of portability, interoperability, and results that are the same across all supported platforms. Run other tools on destination folders, not source Because tagpix's default transfer mode separates images from other content in source folders, it may impact the results of other tools that store data alongside images. 
For example, tagpix will destroy a thumbspage gallery in a source folder, by separating its index page, thumbnails subfolder, and images. The PyPhoto viewer may be similarly neutered, because its thumbnails-cache files and images are moved to different destinations. This cannot be remedied (merging metadata of arbitrary tools is impossible), but you can avoid the issue altogether by applying such tools to tagpix destination folders only, not source folders. That is, run other tools on merged results, not unmerged input. Because merged destination folders are only ever extended, their content is never scattered by tagpix. Source folders are generally best used for staging photos to be later moved by tagpix, per the recommendations ahead. Modes update: though it comes with some tradeoffs, version 2.1's new copy-only mode can now be used to extract images from a source tree without destroying it. See 2.1's release notes ahead. The preceding still applies to both the original and default file-move mode, as well as the new copy-and-delete mode. Moves across drives and devices tagpix uses Python's os.rename() to move files from source to destination, which is normally correct, fast, and atomic. File moves can be problematic, though, when run between different devices or filesystems. If a run's moves all fail due to differing devices, make sure your source and destination folders reside on the same writable device—copy the source folder to the same hard drive or SSD as your destination folder, before the tagpix run. This is a minor inconvenience, but makes all tagpix runs quicker, and copying new source images to a temporary staging folder is recommended practice anyhow; merging from a camera or camera card directly leaves no backup copy if anything goes wrong. 
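To make the trade-off between moves and copy-and-delete concrete, here is a minimal sketch of a move that falls back to copying only when the rename fails across devices. It is an illustration, not tagpix's code: move_file is a hypothetical name, and relying on errno.EXDEV to signal a cross-device failure is an assumption that holds on POSIX systems and should be verified on other platforms.

```python
import errno
import os
import shutil

def move_file(src, dst):
    """Move src to dst; fall back to copy-then-delete across devices.

    Hypothetical helper, not taken from tagpix.
    """
    try:
        os.rename(src, dst)            # fast path when on the same device
    except OSError as exc:
        if exc.errno != errno.EXDEV:   # re-raise anything but cross-device
            raise
        shutil.copy2(src, dst)         # copy content plus timestamps
        os.remove(src)                 # then delete the original
```

shutil.copy2() is used in the fallback because it attempts to preserve modification times, which matter when modification date serves as the last-resort date of origin. The standard library's shutil.move() implements a similar fallback.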
Developers notes: Python's os.replace() doesn't help here, because it still raises an exception across different drives and devices on Windows, Mac OS, and Linux (this call just avoids Windows exceptions if the target file exists on the same device). The only alternative to moves is to copy and delete, which can be much slower for large photo archives, and cross-device moves seem too rare and dangerous to justify the slowdown for all use cases, especially when a manual pre-run copy of the source folder takes roughly the same amount of time.

Modes update: though they come with some tradeoffs, version 2.1's new copy-only and copy-and-delete modes can now be used to merge across different drives and devices directly. See 2.1's release notes ahead. The preceding still applies to the original and default file-move mode.

Recent Changes

This section describes changes made in recent tagpix versions. It is meant primarily for developers and prior-version users, though additional usage-level details and context are presented along the way. tagpix is occasionally repackaged with minor documentation-only changes (e.g., to this doc and its demos), but code and functionality changes occur only in the versions listed here.

Version 2.3: Silence Pillow DOS Warning

tagpix was patched and rereleased on September 29, 2020 with two upgrades. The first was a minor UI improvement: at input prompts, typing control+C to exit now yields a user-friendly message instead of a Python exception traceback, and source-file existence is checked ASAP. For example:

    ~/Desktop/camera$ python3 ~/MY-STUFF/Code/tagpix/tagpix.py
    tagpix renames and moves photos to a merged folder; proceed? y
    Source - pathname of folder with photos to be moved? ^C
    Script not run: no changes made.

    ~/Desktop/camera$ python3 ~/MY-STUFF/Code/tagpix/tagpix.py
    tagpix renames and moves photos to a merged folder; proceed? y
    Source - pathname of folder with photos to be moved? Spam
    Script not run: source folder does not exist, no changes made.

The second upgrade was more urgent: code was added to silence a bogus DecompressionBombWarning message now issued senselessly by the underlying Pillow library for all large images. Specifically, when running tagpix on images larger than 89MP, the Pillow library by default prints a single DOS (denial of service) warning message in program output that looks like this (with line-breaks added here for marginal readability):

    /Library/Frameworks/Python.framework/Versions/3.8/lib/python3.8/site-packages/PIL/Image.py:2797:
    DecompressionBombWarning: Image size (108000000 pixels) exceeds limit of 89478485 pixels,
    could be decompression bomb DOS attack.
      warnings.warn(

This baseless warning is completely harmless, and does not impact tagpix results (large images work either way). But it's also stupidly excessive, and needlessly confuses users of this and many other Pillow-based programs. It was first seen for perfectly valid 108MP images shot on a Galaxy Note20 Ultra smartphone in 2020, and will crop up for large images created on many other devices and tools in widespread use. Obviously, these are not "attacks," despite the warning's language. Users who see this, however, may assume it reflects bugs or viruses.

To see the changes applied to silence the message, search for Sep-2020 in the source; the fix was trivial, but the cost of rereleasing this and other programs impacted at tagpix's host site was not. Such is life when "batteries included" meets open-source agendas.

Postscript: though scantly documented, it turns out that Pillow later turned the warning described here into a full error for images larger than twice the warning's size limit. This error takes the form of an exception that will cause client programs to fail or terminate. Despite this, its only mention seems to be in an obscure release note.
To avoid kills, tagpix's warning-silencing code has been updated to use a new and broader fix, which will suffice only until Pillow tightens the screws again. This check should clearly be opt-in for programs that need to care.

Version 2.2: Use and Drop Android Dates

Version 2.2, finalized in December 2018, was a minor release that addressed just one specific issue. Specifically, it was enhanced to automatically process origin dates added to photo filenames by Android cameras: it utilizes these dates if no Exif date-taken tag is present, and discards these dates (but not times) to avoid redundancy with tagpix-added dates. A utility script was also coded to drop Android filename dates on demand for users of prior tagpix releases.

This change applies only to tagpix users who have shot photos on Android devices, or may do so in the future. Given the potential magnitude of this subset, though, the rest of this section provides complete coverage. For a brief look at this change's results, see this log and shot. For the full story, read on.

The Issue

Most digital cameras assign filenames to images using a simple format that accommodates the basic but portable FAT filesystem's 8.3 naming convention. For instance, a DSC or IMG prefix followed by a sequence number suffices to identify images on a given camera, though not across different cameras. This is one of the main limitations tagpix solves, by expanding the first of the following forms to the second, with a date-of-origin prefix:

    DSC03249.JPG
    2018-02-05__DSC03249.JPG

By contrast, cameras on some Android devices (and perhaps others) add a date in photo filenames which, combined with an added time, identifies images by their moment of creation, but is redundant with that added by tagpix's own renaming logic.
For example, such images' filenames are initially expanded by tagpix from the first of the following to the second: 20180205_154910.jpg 2018-02-05__20180205_154910.jpg While the Android-added date and time (separated by _ in the first name above) might be a good idea in a world begun anew, they bifurcate the digital-photos world that is. This is a unique and nonstandard naming scheme, that stamps files with a date that makes tagpix filenames longer unnecessarily, and in most cases is fully redundant with both standard in-file Exif creation-date tags (when present and unchanged), and the date-of-origin prefix added to all photos by tagpix (when its source agrees with the Android stamp). Using Android Dates That said, blindly deleting the Android date in filenames is too extreme, because it may be the only record of creation date in some scenarios. For example, Android photos edited in tools that discard Exif tags won't have a date-taken tag, but will retain a creation date in their filenames that normally differs from the file's modification date (which is generally a last-edit date). More subtly, some recent Samsung Android devices never record Exif date-taken tags for front—a.k.a. "selfie"—cameras. This is a known issue that you can explore on the web here and here. It may be a temporary bug that Samsung will fix in an update, and back cameras on these devices do record Exif dates correctly. But discarding the Android filename date of photos shot on such devices' front cameras would also drop valuable metadata found nowhere else. Because the filename date is potentially useful in such cases, tagpix 2.2 has generalized the way it chooses a date of origin to be used for the prefix it adds to filenames. 
Formally, it now always tries three sources in turn, until a date is selected:

1. Use the Exif date-taken tag, if present
2. Else use the Android filename date, if present
3. Else use the file's modification date as a last resort

The net effect selects the best date of origin possible for filename prefixes, a crucial part of tagpix's organizational role. The first step above is applied to photos only (other content types don't have Exif tags). The other steps are run for all types of content in source trees, including photos without usable Exif tags.

The second step above is new, requires heuristics to detect dates, and applies only to a subset of users and images, but is necessary to accommodate metadata recorded outside the Exif model by a handful of devices and manufacturers. A special case to be sure, but exceptions seem as much the norm in the digital camera domain as the computer field at large!

Because step two is partly heuristic (it looks for matching strings and checks their content for valid dates), it can also be disabled by setting UseAndroidFilenameDates in the user configs file. This switch is preset to True to cover the norm; set it to False in the unusual event that filenames in your source tree appear to embed Android dates just by coincidence.

Dropping Android Dates

After the tagpix date has been selected per the prior section, tagpix 2.2 addresses the redundancy of Android filename dates with a new renaming step, run before duplicates detection and file move or copy. If enabled by setting DropAndroidFilenameDates to True in the user configs file, the tagpix.py main script now automatically renames merged photo files to drop the superfluous Android date and keep only the tagpix date (along with the Android-added time, which helps identify the photo).
For instance, it shortens from the first of the following to the second:

    2018-02-05__20180205_154910.jpg
    2018-02-05__154910.jpg

This step is enabled by default, because it yields shorter names, and normally has no impact on duplicates processing or content access: the shorter form is no less unique or meaningful than the longer. The tagpix date is usually the same as the Android date, whether it is taken from Exif tags or filename.

As a special case, though, this new renaming step can also be specialized with switch KeepDifferingAndroidFilenameDates to drop only Android dates that are the same as the tagpix date. Though unlikely, the two dates may differ if a photo's Exif-tag date is not the same as its Android-filename date, which is generally possible only after manual changes to either, given tagpix's date-selection algorithm. In such rare cases, the tagpix and Android dates may disagree, as in the following inconsistently changed photo:

    2018-08-03__20180408_073757.jpg

Set the keep switch to True in the user configs file if you wish to retain the Android date when it differs this way. This switch defaults to True to be cautious, because an auto-shortened filename carries less information in this case only. Still, this case seems too unlikely to apply to most, if any, users (and if it does apply to you, you probably understand both the perils of manual metadata changes, and the need for such an obscure switch!).

For an example of 2.2's automatic handling of Android filename dates, see the console log here, and the screenshot of its results folder here. In the end, the combination of using and dropping such dates shortens filenames of all photos shot on Android cameras, without sacrificing filename metadata when useful.
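The date-selection order and the date-dropping rename described in the two preceding sections can be sketched roughly as follows. This is an illustration under stated assumptions, not tagpix's implementation: the regular expression is a guess at what an Android-style name looks like, the function names are hypothetical, and the Exif date is passed in as an argument rather than read from the file. The three keyword arguments loosely mirror the UseAndroidFilenameDates, DropAndroidFilenameDates, and KeepDifferingAndroidFilenameDates switches.

```python
import os
import re
from datetime import date, datetime

# Assumed shape of Android camera names such as 20180205_154910.jpg;
# tagpix's real heuristic is not shown in this guide.
ANDROID_NAME = re.compile(r"^(\d{4})(\d{2})(\d{2})_(\d{6}.*)$")

def android_date(filename):
    """Return the date embedded in an Android-style name, or None."""
    match = ANDROID_NAME.match(filename)
    if not match:
        return None
    year, month, day = (int(part) for part in match.group(1, 2, 3))
    try:
        return date(year, month, day)      # rejects e.g. month 13
    except ValueError:
        return None

def pick_date(path, exif_date=None, use_android=True):
    """Try the Exif date, then the filename date, then modification date."""
    if exif_date is not None:
        return exif_date
    if use_android:
        found = android_date(os.path.basename(path))
        if found is not None:
            return found
    return datetime.fromtimestamp(os.path.getmtime(path)).date()

def tagged_name(filename, origin, drop=True, keep_differing=True):
    """Prefix a name with the origin date, optionally dropping Android's."""
    prefix = origin.strftime("%Y-%m-%d") + "__"
    embedded = android_date(filename)
    if drop and embedded is not None:
        if embedded == origin or not keep_differing:
            filename = ANDROID_NAME.match(filename).group(4)
    return prefix + filename

if __name__ == "__main__":
    print(tagged_name("DSC03249.JPG", date(2018, 2, 5)))
    print(tagged_name("20180205_154910.jpg", date(2018, 2, 5)))
    print(tagged_name("20180408_073757.jpg", date(2018, 8, 3)))
```

With the defaults, the three calls reproduce the filename forms shown in this section: a plain camera name gains only the prefix, a matching Android date is dropped, and a differing Android date is kept.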
On-Demand Renaming For more specialized roles, 2.2 also adds a new utility script _drop-redundant-dates.py, which can be run on demand to drop all Android dates in images already processed by a former version of tagpix (or a later version run with auto-renaming disabled). This utility script is never required for users of tagpix 2.2+ if auto-renaming is enabled, and usually must be run just once by pre-2.2 users who have upgraded. It is also somewhat naive: it makes no attempt to determine if the Android date dropped differs from that of the tagpix date formerly added. Be sure to use its list-only mode to preview changes before running it to update photos; because prior versions of tagpix didn't use filename dates in the absence of Exif dates, some formerly-merged Android photos may be labeled with file-modification date instead. One special case here: as described in the new utility script's docstring, if you're using a tool that relies on the names of images, you may need to rerun the tool after running the utility script, to pick up the new names. This requirement naturally varies per tool. For instance, the HTML viewer pages generated by the thumbspage gallery builder hardcode image filenames, which can be invalidated by later renames. On the other hand, this isn't a concern for the PyPhoto GUI viewer, which updates its thumbnails cache automatically on image changes. This special case is also completely irrelevant when using the 2.2 automatic renaming of tagpix.py, because its renaming occurs before other tools can be run on its merged results. Where possible, use automatic renaming instead of the on-demand utility script. Request for comments: there undoubtedly are additional device-specific photo-naming conventions beyond the Android camera pattern addressed here (e.g., some Windows screenshot names may redundantly embed date/time information too). 
If you'd like to see other filenames accommodated by tagpix, please send feedback via the Input link in this doc's bottom toolbar. As it stands, device manufacturers seem to be climbing over each other to come up with proprietary naming conventions with no interest in standardization or interoperability, and supporting all the constantly changing variants in this context would be akin to herding cats.

Version 2.1: Multiple Enhancements

Version 2.1, finalized in October 2018, was a major update, which generalized source-tree subfolder skips; added a simple but crucial deletion verification; improved duplicates detection; introduced new file-transfer modes that copy instead of move; and cleaned up a few dark but rare corners.

Code refactoring, user configs file

Some code was refactored to remove redundancy (including three same-work loops merged into one: see moveall()). This had no impact on program operation or results, but makes future changes easier. A new file was also added for user configurations, user_configs.py. This has only a small number of settings at present but better supports future customizations.

Subfolder skips enhancements

Version 2.1 generalizes the code that skips source-tree subfolders to use a regular expression pattern that can be more easily modified by users to skip additional folders. To extend or customize the set of subfolders skipped, modify the setting for variable IgnoreFoldersPattern in the user-configurations file user_configs.py. This pattern's new preset skips .* hidden folders; thumbs thumbnail folders created by some tools (including older versions of PyPhoto that predate its single-file caches); and _thumbspage thumbnail/viewer-page folders created by the latest thumbspage image-gallery builder. For a demo of 2.1 subfolder skipping, see this example.
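As a rough illustration of pattern-based subfolder skipping, the sketch below prunes matching folders during a tree walk. The pattern value here is an assumption assembled from the three presets named above; the actual IgnoreFoldersPattern setting in user_configs.py may be written differently, and walk_unskipped is a hypothetical helper, not a tagpix function.

```python
import os
import re

# Assumed value for illustration only; see user_configs.py for the real preset.
IgnoreFoldersPattern = r'\..*|thumbs|_thumbspage'

def walk_unskipped(source_root, pattern=IgnoreFoldersPattern):
    """Yield (folder, filenames) pairs, pruning subfolders matching pattern."""
    skip = re.compile(pattern)
    for folder, subfolders, filenames in os.walk(source_root):
        # Rebuild the list in place, so os.walk never descends into any
        # matching subfolder, even when several match in the same folder.
        subfolders[:] = [name for name in subfolders
                         if not skip.fullmatch(name)]
        yield folder, filenames
```

To skip an additional folder name in this sketch, add another alternative to the pattern, such as r'\..*|thumbs|_thumbspage|previews'.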
Note that this matters only for subfolders having irrelevant images (e.g., thumbnails); applies only to folders in your source tree (the destination tree is not scanned for images to add to the collection); and is not required if your source folders to be skipped are named with a leading . (the pattern preset already skips all such folders, though some zip and backup tools may skip them too). The code now also correctly skips multiple matching folders when present.

Prior-output deletion verifications

Version 2.1 now verifies deletion of prior-run outputs with an extra input after each prompt #6, because the deletion is immediate (and if unintended might be catastrophic!). Reply with an n or simply press the Enter/return key to cancel the delete (a control+C at any prompt works to kill the program in general, but may be too late for weary users to apply):

Delete all prior-run outputs in "./MERGED/PHOTOS"? y
....About to delete: ARE YOU SURE? n
Delete all prior-run outputs in "./MERGED/OTHERS"? y
....About to delete: ARE YOU SURE?

Duplicate ID numbers per file, not category

Version 2.1 now assigns unique ID sequence numbers per individual file, not across an entire content category. These numbers are used to create unique filenames, for files of the same name but different content. The original tagpix used a single per-run counter; 2.0 used 3 per-category counters; and 2.1 now counts up from 1 for each file with duplicates. This makes duplicate filenames more coherent (they are numbered strictly 1..N), but is also crucial for detecting duplicate content across all of a filename's variants, as required by the next item; when IDs were unique within a category only, a prior run's IDs might be arbitrarily higher for a given filename than those of a later run, making same-duplicate detection difficult.
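Per-file numbering of this kind can be sketched as a loop that counts up from 1 for one filename, stopping when a candidate name is either unused or already holds the same content. This is a minimal sketch under stated assumptions, not the moveone() code in tagpix.py: the helper name resolve_destination and the "__N" suffix format are hypothetical, and the byte-for-byte comparison via filecmp is just one way to test for same content.

```python
import filecmp
import os

def resolve_destination(source_path, dest_folder, filename):
    """Return a free destination path, or None if the content is already there."""
    stem, ext = os.path.splitext(filename)
    candidate = os.path.join(dest_folder, filename)
    dup_id = 0
    while os.path.exists(candidate):
        # Compare full content, not just size and modification time.
        if filecmp.cmp(source_path, candidate, shallow=False):
            return None                      # true duplicate: skip it
        dup_id += 1                          # count 1..N for this file only
        candidate = os.path.join(dest_folder, f'{stem}__{dup_id}{ext}')
    return candidate
```

Because numbering restarts at 1 for each filename, a later run walks the same sequence of variant names as an earlier run did, so it can find same-content copies no matter which run added them.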
Improved handling of rare duplicate cases

Version 2.1 repairs a minor defect that was never observed in 5 years of practice, and seems about as likely to occur as lightning striking the machine running the script. But: if there were three source image files with the same filename; and two of these files' content differed from the first moved; and the two duplicate files were merged to the first-moved's destination folder by two different tagpix runs; and the numeric-ID suffix added to the two duplicate files' names happened to be the same on each of the different runs; then the filename generated for the two duplicates might be the same, causing an exception on Windows, and overwrites on Unix.

The simple fix, in moveone() of the script, is to increment the numeric-ID suffix in a loop, until the resulting filename either does not exist in the destination folder or matches an existing same-named file there by content (as formerly done in the related music-file program flatten-itunes). This avoids file overwrites in all contexts (the former defect), but also correctly skips all same-content images for a given filename, whether they match the first instance of the filename moved to the destination (as before), or any differing-content duplicate added later with a uniquely suffixed ID (new behavior). For a short demo of the new duplicates-resolution logic in action, see this example.

The new behavior (skipping duplicates having content the same as another duplicate) addresses the unlikely event of modified copies being copied to multiple folders unmodified. This works well and as it should, but is also the tagpix equivalent of a second lightning strike...

New copy-only and copy-and-delete mode options

Version 2.1 adds both copy-only and copy-and-delete file transfer modes, enabled by settings in user_configs.py. These are alternatives to the original and default file-move mode, which always removes files from the source tree by definition.
The two new modes copy source files byte-for-byte to the destination, instead of directly moving them. This makes the new modes run slower, but in some roles can make manual source-content copies unnecessary, and lets you use tagpix in additional contexts:

Copy-and-delete mode
Allows tagpix to work with source and destination folders on different devices. For instance, this mode can be used to run merges between a camera card or USB flashdrive, and a PC's internal drive. Direct moves fail when source and destination folders are on different drives.

Copy-only mode
Allows tagpix to extract images from a source tree without changing the tree's contents in any way. For example, this mode can be used to collect images from gallery or viewer folders, while leaving those folders intact. Direct moves may separate, and thereby destroy, the content of such folders. Like copy-and-delete, copy-only mode can also be used when folders reside on different drives.

In short, these two new modes provide extra utility, as captured in this example. Nevertheless, the original file-move mode is still the tagpix preset default, both because moves always run faster than copies, and because this mode promotes better practice. In terms of practice:

Because the new copy-only mode never removes anything from the source tree, files may accumulate there over time, requiring duplicate tests and skips. This can make tagpix very slow: if 1,000 images linger in a source folder from prior runs, all 1,000 must be compared and skipped on every future run.

The new copy-and-delete mode does remove files from the source tree, but also might encourage users to skip making temporary copies of camera storage altogether. This can be a dangerous practice: if tagpix deletes files from your camera storage directly, there will be no backup copy if anything goes wrong.
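The three transfer modes discussed above can be sketched with standard-library calls as follows. This is an illustration only: the transfer function and its mode strings are hypothetical, the real modes are selected by settings in user_configs.py that are not shown here, and the choice of shutil.copy2 (which also copies modification times) is an assumption rather than a statement of what tagpix does.

```python
import os
import shutil

def transfer(source_path, dest_path, mode='move'):
    """Transfer one file in one of three hypothetical modes."""
    if mode == 'move':
        # Fast, but generally fails across drives or devices; on Windows it
        # also raises an error if dest_path already exists.
        os.rename(source_path, dest_path)
    elif mode == 'copy-only':
        # The source tree is left unchanged.
        shutil.copy2(source_path, dest_path)
    elif mode == 'copy-and-delete':
        # Works across devices; the source is removed only after the copy.
        shutil.copy2(source_path, dest_path)
        os.remove(source_path)
    else:
        raise ValueError(f'unknown transfer mode: {mode!r}')
```

In this sketch, an exception raised during the copy step leaves the source file in place, since the removal runs only after the copy returns.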
Hence, as both general rule and recommended usage: copy your initial or new source images to a temporary staging folder to be used as the tagpix source tree, and use the default file-move mode. Unless your use case is more custom, this is still the best and safest way to use tagpix.

Version 2.0: Numerous Upgrades

Version 2.0, finalized in October 2017, was a major step up from the former, simplistic script, as summarized below.

Changes Made

Among version 2.0's foremost improvements, it now:

Parameters
Gets all run parameters as console inputs (not code variables). Command-line arguments are not used, because they are cryptic; to provide input programmatically, redirect stdin to a file of precoded replies, or a shell in-script 'here' document, as described above and later here. Per earlier, also sends prompts to stderr so stdout report text can be saved for easier review.

List-only mode
Adds an option to list planned changes only, making no changes. Use this to inspect and verify proposed changes without applying them.

Year subfolders
Adds an option to group the resulting flat folders into by-year subfolders automatically (for photos, movies, and others).

TIFFs and mimetypes
Handles non-JPEG images by using Python's mimetypes module, so other images may be treated as photos too. Still, because Exif tags are apparently used only by JPEG and TIFF images and WAV audio (PNG and WebP images may have metadata too, but their standards and support are evolving), only JPEG and TIFF mime types are treated as 'photos' here; others go to the OTHERS folder: as images, but not photos. For more details, try this page or a web search. 2.0 also uses mimetypes for movie detection, adding newer video types in case some platforms do not.

Source folder
Allows the source folder to be separate from this script's own folder. Moving huge photo archives to a temp folder can be expensive (one subject folder was 75G).
To use the prior model, copy images to ./SOURCE (in the current working directory (CWD), which is the script's own folder if it's run from there), and press Enter when asked for the source folder's path.

Destination folder
Allows the results folder to be separate from this script's own folder (i.e., CWD). This in turn allows the program to extend a prior run's results when desired, instead of always making a new archive folder (see Usage Modes). To use the prior model, press Enter when asked for the destination folder's path, and copy results from ./MERGED.

Movies folder
Moves all video mime-type files to a new MOVIES subfolder, instead of lumping them in with OTHERS as before (or PHOTOS).

Additional changes
Addresses additional issues cut short here for space (see the code for more details on the following):

Verifies runs by console input (e.g., if clicked accidentally)
Catches/reports move errors and continues (e.g., permissions, locks, path lengths)
Avoids calling the photo-tag extractor for non-image files
Traps missing tags to avoid a generic None-indexing error message
Detects/strips an existing filename tag prefix from a prior run
Skips truly duplicate content, per the new scheme in Overview
Dups do ID+=1 instead of using enumerate(): ID was too high if many items
Skips thumbs/ thumbnail subfolders created by programs like PyPhoto
Skips .* Unix hidden items like Mac .DS_Store files (but they may reform!)

Open Issues

Despite its upgrades, version 2.0 left the following issues on the table (see also the later changes in 2.1 and 2.2):

Report location
This release allows its output to be routed to a file with its stderr/stdout split model, but it could instead always save the report in the MERGED root folder of the results, with an appended date/time suffix. This was not implemented because the reports might become unwelcome trash after many runs, but that rationale is open to debate.
Windows path lengths
tagpix could support too-long pathnames on Windows with the \\?\ pathname-prefix trick (like Mergeall and ziptools). But this case is rare, it can be addressed by using a shorter (higher) destination-folder path, and users may not be able to view the results in Explorer anyhow. Punt in this release, but revisit if feedback warrants (see Input in the toolbar below).

Prior to version 2.0, thumbspage was a basic, tactical script that was neither robust nor customizable. And then it was used.

Usage Caution

tagpix has been tested extensively and used successfully on extremely large photo collections, including all those of its creator, and it will likely perform well on yours too. It is provided freely because it can help you organize your photo libraries. Especially given the many ways that computers can fail, however, a word of caution is in order:

By design, this script's default operation moves and renames all photos and other files in an entire source folder tree. No automated method for undoing the changes it makes is provided, and no warranty is included with this program.

Please read all usage details in this document carefully before running tagpix on your photos. It is strongly recommended to preview changes with list-only mode before applying them; and either run tagpix on a temporary copy of your source folder tree, or enable its copy-only transfer mode in file user_configs.py to avoid source-tree changes.

Lest that sound too dire, keep in mind that tagpix never changes photo content (it transfers and renames them only), and errors simply leave items in their original location in all transfer modes (a rerun can propagate them to the destination). Moreover, if you always copy/paste new images from your camera's storage to a tagpix staging folder (per the preceding notebox's recommendation), the camera's storage will automatically serve as a backup copy, regardless of this program's operation.
Still, the importance of your photos merits a complete understanding of any tool that modifies them, this one included.
sectionhttps://learning-python.com/tagpix/UserGuide.html#Resolving Skips
toohttps://learning-python.com/tagpix/UserGuide.html#otherimages
folderhttps://learning-python.com/tagpix/screenshots/
flathttps://learning-python.com/tagpix/screenshots/results-flat.png
group-by-yearhttps://learning-python.com/tagpix/screenshots/results-grouped.png
reporthttps://learning-python.com/tagpix/UserGuide.html#Results Report
skip patternhttps://learning-python.com/tagpix/UserGuide.html#21folderskips
Overviewhttps://learning-python.com/tagpix/UserGuide.html#duplicates
promptshttps://learning-python.com/tagpix/UserGuide.html#Input Prompts
version 2.1https://learning-python.com/tagpix/UserGuide.html#21copymodes
promptshttps://learning-python.com/tagpix/UserGuide.html#Input Prompts
copied) to the prior run's folders. To make a new archive (e.g., for an initial or incremental optical-disc burn), for prompt #3 give a new destination-folder path, perhaps with the run date in its name; source items will be moved (or copied) to the new archive's folders. To add new items to both an incremental archive for burning and an existing archive for viewing, use the preceding mode B first, and then merge the new archive's contents into an existing archive with another tool (a GUI cut/paste or drag-and-drop will generally suffice). https://learning-python.com/tagpix/UserGuide.html#21copymodes
herehttps://learning-python.com/tagpix/examples/large-13k-photo-extending.txt
herehttps://learning-python.com/tagpix/examples/demo-5-newitems.txt
herehttps://learning-python.com/tagpix/examples/tx-run-on-android-log.txt
folderhttps://learning-python.com/tagpix/examples/
release noteshttps://learning-python.com/tagpix/UserGuide.html#21copymodes
promptshttps://learning-python.com/tagpix/UserGuide.html#Input Prompts
modehttps://learning-python.com/tagpix/UserGuide.html#21copymodes
overviewhttps://learning-python.com/tagpix/UserGuide.html#Overview
release noteshttps://learning-python.com/tagpix/UserGuide.html#Version 2.2
earlierhttps://learning-python.com/tagpix/UserGuide.html#Resolving Skips
prompthttps://learning-python.com/tagpix/UserGuide.html#Input Prompts
aheadhttps://learning-python.com/tagpix/UserGuide.html#21folderskips
Resolving Skipshttps://learning-python.com/tagpix/UserGuide.html#Resolving Skips
earlierhttps://learning-python.com/tagpix/UserGuide.html#Overview
retainshttps://learning-python.com/tagpix/UserGuide.html#Version 2.2
thishttps://learning-python.com/tagpix/UserGuide.html#useandroiddates
threadhttps://stackoverflow.com/questions/237079/how-to-get-file-creation-modification-date-times-in-python/39501288
comparisonhttps://en.wikipedia.org/wiki/Comparison_of_file_systems#Metadata
os.path.getctime()https://docs.python.org/3/library/os.path.html?highlight=os%20path%20getctime#os.path.getctime
os.stat()https://docs.python.org/3/library/os.html?highlight=os%20stat#os.stat
thumbspagehttp://learning-python.com/thumbspage.html
PyPhotohttp://learning-python.com/pygadgets.html
aheadhttps://learning-python.com/tagpix/UserGuide.html#21copymodes
release noteshttps://learning-python.com/tagpix/UserGuide.html#21copymodes
os.rename()https://docs.python.org/3/library/os.html#os.rename
os.replace()https://docs.python.org/3/library/os.html#os.replace
release noteshttps://learning-python.com/tagpix/UserGuide.html#21copymodes
demoshttps://learning-python.com/tagpix/screenshots/index.html
Pillowhttps://pypi.org/project/Pillow/
imageshttps://pillow.readthedocs.io/en/stable/reference/Image.html#PIL.Image.open
programshttps://duckduckgo.com/?q=Pillow+DecompressionBombWarning
sourcehttps://learning-python.com/tagpix/tagpix.py
otherhttps://learning-python.com/thumbspage/UserGuide.html#pillowdoswarning
release notehttps://pillow.readthedocs.io/en/stable/releasenotes/5.0.0.html
broader fixhttps://learning-python.com/tagpix/viewer_thumbs.py
loghttps://learning-python.com/tagpix/examples/2.2-use-and-drop-android-dates.txt
shothttps://learning-python.com/tagpix/screenshots/results-2.2-android.png
8.3https://duckduckgo.com/html?q=camera%20filenames%20fat%208.3
herehttps://forums.androidcentral.com/samsung-galaxy-s9-s9-plus/914277-front-camera-photos-have-no-photo-date-time.html
herehttps://www.google.com/search?q=samsung+front+camera+exif+date
heuristichttps://en.wikipedia.org/wiki/Heuristic
user configshttps://learning-python.com/tagpix/user_configs.py
user configshttps://learning-python.com/tagpix/user_configs.py
tagpix.pyhttps://learning-python.com/tagpix/tagpix.py
algorithmhttps://learning-python.com/tagpix/UserGuide.html#useandroiddates
user configshttps://learning-python.com/tagpix/user_configs.py
herehttps://learning-python.com/tagpix/examples/2.2-use-and-drop-android-dates.txt
herehttps://learning-python.com/tagpix/screenshots/results-2.2-android.png
_drop-redundant-dates.pyhttps://learning-python.com/tagpix/_drop-redundant-dates.py
thumbspagehttp://learning-python.com/thumbspage.html
PyPhotohttp://learning-python.com/pygadgets.html
moveall()https://learning-python.com/tagpix/tagpix.py
user_configs.pyhttps://learning-python.com/tagpix/user_configs.py
user_configs.pyhttps://learning-python.com/tagpix/user_configs.py
PyPhotohttp://learning-python.com/pygadgets.html
thumbspagehttp://learning-python.com/thumbspage.html
this examplehttps://learning-python.com/tagpix/examples/2.1-subfolder-skipping.txt
prompthttps://learning-python.com/tagpix/UserGuide.html#Input Prompts
contenthttps://learning-python.com/tagpix/UserGuide.html#Overview
the scripthttps://learning-python.com/tagpix/tagpix.py
flatten-ituneshttp://learning-python.com/flatten-itunes-2.py
this examplehttps://learning-python.com/tagpix/examples/2.1-duplicates-resolution.txt
user_configs.pyhttps://learning-python.com/tagpix/user_configs.py
this examplehttps://learning-python.com/tagpix/examples/2.1-all-transfer-modes.txt
herehttps://learning-python.com/tagpix/UserGuide.html#automatedinputs
earlierhttps://learning-python.com/tagpix/UserGuide.html#Results Report
this pagehttps://en.wikipedia.org/wiki/Exif
Usage Modeshttps://learning-python.com/tagpix/UserGuide.html#Usage Modes
codehttps://learning-python.com/tagpix/tagpix.py
Overviewhttps://learning-python.com/tagpix/UserGuide.html#Overview
PyPhotohttp://learning-python.com/pygadgets.html
2.1https://learning-python.com/tagpix/UserGuide.html#Version 2.1
2.2https://learning-python.com/tagpix/UserGuide.html#Version 2.2
filehttps://learning-python.com/tagpix/UserGuide.html#Results Report
Mergeallhttp://learning-python.com/mergeall.html
ziptoolshttp://learning-python.com/ziptools.html
list-onlyhttps://learning-python.com/tagpix/UserGuide.html#Input Prompts
copy-onlyhttps://learning-python.com/tagpix/UserGuide.html#21copymodes
user_configs.pyhttps://learning-python.com/tagpix/user_configs.py
http://learning-python.com/index.html
Tophttps://learning-python.com/tagpix/UserGuide.html
Codehttp://learning-python.com/tagpix/
Pagehttp://learning-python.com/tagpix.html
Newshttp://learning-python.com/post-release-updates.html
Bloghttp://learning-python.com/posts.html
Appshttp://learning-python.com/programs.html
