I had a great week. I finished the basics of the new file maintenance system I wanted, cleaned up the duplicate filter a little more, and fixed a bunch of bugs.
file maintenance system
There are a number of large file re-checking jobs the client wants to do, both now and in the future. Going back to figure out more accurate video durations and image rotations, discovering webms that were formerly incorrectly detected as mkvs, eventually integrating videos into the duplicate checking system, all of these will require a combined whack of maintenance CPU that I don't want to hit all at once. I have previously sketched out some disparate systems for these jobs, but none were really doing the trick, so this week I unified it all into one nice system that can handle all sorts of jobs. This new system is simple for now but will get more work in future.
You do not have to do anything, but if you pay attention to your maintenance work, you may notice some new file metadata and thumbnail jobs running in the background or on shutdown. This new system fits into normal maintenance just like database analyzing or repository processing. You can govern whether it is permitted to run in regular idle time and/or shutdown time under options->maintenance and processing and also change its in-built 'throttle', which limits the number of files it will work on (rather than running full bore on what in future may be quite large jobs). The default throttle is 200 files every day, which for most jobs on most machines will be about 30 seconds to three minutes work. Do not expect it to do much work yet.
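The throttle logic described here is simple: the manager tracks how many files it has worked on in the current day and stops once the allowance is spent. A minimal sketch of the idea, using hypothetical names (the real hydrus classes are different):

```python
import time

# Hypothetical sketch of a daily file-count throttle for background
# maintenance work; not the actual hydrus implementation.
class MaintenanceThrottle:

    def __init__(self, files_per_day=200):
        self.files_per_day = files_per_day
        self._window_start = time.time()
        self._files_done = 0

    def _reset_if_new_day(self):
        # roll the counting window over once 24 hours have passed
        if time.time() - self._window_start >= 86400:
            self._window_start = time.time()
            self._files_done = 0

    def can_do_work(self):
        self._reset_if_new_day()
        return self._files_done < self.files_per_day

    def report_work(self, num_files=1):
        self._files_done += num_files

def run_maintenance(queue, throttle):
    # work through queued jobs until the daily allowance is spent
    while queue and throttle.can_do_work():
        job = queue.pop(0)
        job()  # e.g. regenerate metadata or a thumbnail
        throttle.report_work()
```

With the default of 200 files per day, a queue of thousands of scheduled regen jobs simply drains a little each maintenance cycle rather than all at once.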
Existing file regeneration routines now work through this system, and it does its job much better than before. If you hit right-click->regenerate->x on some thumbnails, the job now runs in a regular popup button (rather than the locking 'modal' one from before), letting you keep browsing while it works. And if you select more than 50 thumbnails (think, say, right-clicking on 2,000 video files and saying to regenerate their thumbnails if they are the wrong size), you will now get the option to schedule that big job for later, at which point those 2,000 jobs will end up in the normal idle maintenance queue, to work at 200 files a day or whatever you wish.
This system is fairly opaque at the moment. You can trigger it with the thumbnail right-click, and certain db operations may schedule new jobs for it, but there is no UI to review it yet. In the coming weeks, I expect to write a new 'review' window off the database menu that will let you review total pending jobs, start work manually, and add and remove pending jobs en masse through the regular search interface. I'll slowly integrate more of the client into it as well, letting it add more jobs into the queue by itself.
Let me know how this all works for you!
The duplicate filter interface got some more work this week, particularly in cleaning up some of my original version's over-engineering. The actions you can choose on the right panel are now split more clearly into 'yes, these files are duplicates, and here is how' decisions vs the 'alternates' and 'not dupes' decisions. Also, 'this file is better' is now split into two buttons for 'delete the worse file' and 'keep both'. This 'delete or not' is split at the shortcut level into two actions as well, if you wish to map both. Existing shortcuts (left-click by default in the filter) will update to the 'and delete the worse file' version.
The complicated 'duplicate action options' object (which governs how to merge metadata across duplicates) therefore no longer handles file deletion. It is also now only attached to the 'better/worse' and 'files are the same' actions–we never found a good reason to merge metadata across all alternates or 'not duplicates', so I have removed it completely. If you want a complicated file delete action, hitting the 'custom action' button now asks you if you wish to delete the file you are looking at, the other one, or both.
Also, to reduce confusion with alternates–which are also technically not duplicates–'not duplicates' is now renamed across the program to the more precise 'not related/false positive'. The 'false positive' action is a record in the db saying 'despite the similar files search thinking these files were related, it was incorrect, so do not bring it up again'.
My hope is that filtering is a bit faster here. If two duplicate files are of very different quality, it is still easy to delete the bad one, but if they are closer in quality and you want to keep both, it is now just one click.
As for the big db-level rewrite, I prepped the duplicate db code for it this week. I am standing at the cliff-edge and feel great about jumping off, so next week I hope to get started on the new code properly and migrate one or both of the current 'alternates' and 'false positive' data to the new system.
I fixed an issue with the recent 'collect by' session saving where the accompanying sort was not being renewed on a session load. Also, several problems with collected media and sort by 'approx bitrate' are fixed.
There's a new checkbox under options->sort/collect that makes it so the default sort is updated every time you click a new sort in regular browsing. It sounds a pain but is actually pretty neat!
The 'all local files' domain is now hidden from view in new page selection and the tag autocomplete dropdown if you are not in advanced mode. This domain, which is fairly technical and covers both trash and 'my files' and the sometimes-hidden repository update files, is often confusing to new users and is rarely useful even for people who know what it does.
If you use the client's local booru and need to override its host when you copy an external link, this option has moved from options->connection to the local booru's manage services panel. You can also override scheme and port as well! The old host override option is gone completely, and the only other place it was used, the manage upnp dialog, now fetches this info more efficiently and fails more gracefully.
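As a rough sketch of how such an override might compose the final link (the names here are hypothetical, not the actual hydrus code), each of scheme, host, and port falls back to the service's defaults when no override is set:

```python
from urllib.parse import urlunsplit

# Hypothetical sketch of a 'copy external url' override: any of
# scheme/host/port may be overridden in the service's settings.
def get_external_url(default_scheme, default_host, default_port, path,
                     scheme_override=None, host_override=None,
                     port_override=None):
    scheme = scheme_override or default_scheme
    host = host_override or default_host
    port = port_override if port_override is not None else default_port
    netloc = host if port is None else f'{host}:{port}'
    return urlunsplit((scheme, netloc, path, '', ''))
```

For example, a booru served locally on `127.0.0.1:45868` but exposed through a reverse proxy could override scheme and host to produce `https://booru.example.com:45868/...` links.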
- duplicate filter:
- duplicate action options no longer handle file deletion
- renamed 'not duplicates' across the program to 'not related' or 'false positive'
- 'alternates' and 'not related/false positive' duplicate actions no longer have duplicate action options. no merge content update now occurs on these actions
- the duplicate filter hover panel now splits 'this is better' decisions into two buttons–whether to delete or keep the worse file
- when selecting 'custom action' in the duplicate filter hover panel, it now asks if you would like to delete the current file, the other file, or both
- the 'duplicate_filter_this_is_better' shortcut action will be auto-updated to 'duplicate_filter_this_is_better_and_delete_other'. an alternate 'duplicate_filter_this_is_better_but_keep_both' is now also available
- the 'duplicate_filter_not_dupes' shortcut action will be auto-updated to 'duplicate_filter_false_positive'
- separated the buttons on the duplicate filter hover panel to more carefully split 'yes, files are duplicates' vs other decisions
- in prep for the duplicate db overhaul, refactored all PHash search code and Duplicate management code apart
- misc other prep work for duplicate db overhaul
- file maintenance:
- wrote a new unified manager to handle various long-term file maintenance tasks like regenerating file metadata and thumbnails
- options to govern how this manager can run are now in options->maintenance and processing. you can enable it for idle and shutdown maintenance time and give it a throttle to limit how fast it will work on files, defaulting to 200 per day
- unified the previous db-level attempts at file maintenance into the new system, which supports async job queueing, and moved the regen code up to the new manager, out of the db lock
- unified a variety of file and thumbnail regen code to work through the new simpler and saner path
- the right-click->regen thumbnail commands now run through the new manager and no longer need a modal popup. you can keep browsing while they work. they will also not hang the ui as the old system could on big jobs
- when right-click->regenning on more than 50 thumbnails, you now get a dialog asking if you want to do the job now or put it off until later
- file maintenance tasks can now run in shutdown time! you will get previews of the jobs with file counts and status progress reports on the shutdown splash
- cleaned up some file extension renaming and dupe-removing code
- in future, I will move the current file integrity check to this new system and have some ui to prompt and set up other big jobs, like fixing various historical misparsing issues
- thumbnail resizing during the thumbnail fade is now more efficient when resizing down
- moved the ClientFilesManager to ClientFiles.py
- the rest:
- the 'manage upnp' dialog now moves the duplicated external ip display from the column up to the status text at the top. it fetches the ip after the initial mappings fetch is done. this ip is no longer affected by the external host override option
- cleaned up options->connection page and removed the now defunct external host override option
- the manage services page for the local booru now has optional override for scheme, host, and port for the 'copy external url' function
- fixed an issue with the recent 'collect by' session saving where a restored session that needed a collect was not sorted
- fixed an issue with collections being sorted by approx bitrate
- added a new checkbox to options->sort/collect to set it so the default sort updates every time you choose a new sort anywhere
- fixed an issue with 'remove trashed files from view', which was incorrectly removing on 'all local files' pages
- the 'all local files' file domain, which is frequently confusing to new users, is now no longer an option for new file pages or the autocomplete file domain if the user is not in advanced mode
- the client now searches for versions of urls both with and without a final '/' character when looking up file url import status at the db level and in import lists. system:known_url is unfortunately still an inefficient mess
- improved how the server code deals with some connectionLost errors
- cleaned up and unified some older dialog button code
- fixed a problem in manage tag siblings when petitioning existing pairs and then cancelling when asked for a reason
- fixed a miscount issue when uploading pending tags while many new tags are coming in. progress would sometimes be -754/1,234, ha ha
- db maintenance, repository sync, and file maintenance processing will all now wake on a force idle mode call
- deleted some old code
- misc fixes and cleanup
- some misc gui layout fixes
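One item above, the trailing-'/' url lookup, is easy to illustrate. This is a hedged sketch with hypothetical helper names, not the actual hydrus implementation:

```python
# Hypothetical sketch of a trailing-slash-insensitive url status lookup.
def url_variants(url):
    # check both 'https://site.com/post/123' and 'https://site.com/post/123/'
    if url.endswith('/'):
        return (url, url[:-1])
    return (url, url + '/')

def lookup_url_status(url, known_url_statuses):
    # known_url_statuses: dict of url -> import status
    for variant in url_variants(url):
        if variant in known_url_statuses:
            return known_url_statuses[variant]
    return 'unknown'
```

Checking both variants at lookup time avoids having to rewrite every stored url just because a parser and a site disagree about the final slash.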
I have quite a few smaller jobs waiting for me, so other than the new duplicate db tables, that's top priority. Some UI bugs to deal with, maybe some Client API work, an experimental jpeg quality estimator, possibly support for some new filetypes, and hopefully a fun new way to quickly add very complicated OR search predicates thanks to a clever user's work.
Just a note, E3 is coming up soon and I will take my shitposting vacation week for it as usual. I think it'll be 356 that's delayed a week.
I had a good week. .ico files are now supported, 'collect by' status is remembered in gui sessions, and I fixed a bunch of bugs.
duplicate overhaul plans
I started the duplicate overhaul work this week with some planning and experimentation with existing data. My original thought here had been to exactly replicate existing functionality just with a more efficient database schema, but having gone through the various edge-case insertion and merge operations, I believe the current system is overcomplicated for what we are actually using it for.
Most of all, the current system tries to form a chain of 'better/worse' comparisons so all dupes within a 'same file' group are ranked against each other. A decent number of human decisions are needed to determine this ranking, but the data is currently not displayable, and we haven't really noticed that absence. For most practical purposes, what we really want to determine is which files should actually be considered dupes of each other, and which file in each group is the best. Most users delete the 'worst' of a pair in any case. Supporting a system that simply tracks a group of duplicates with a single King is more intuitive and reliable, and it is quicker to work with.
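As a rough illustration of the difference, a group with a single King needs no chain bookkeeping at all: 'worse' files just join the pool, and only the King is special. This is a hypothetical sketch of the idea, not the planned db schema:

```python
# Hypothetical sketch of the 'one King per group' duplicate model.
class DuplicateGroup:

    def __init__(self, king):
        self.king = king
        self.members = {king}

    def add_worse(self, file_id):
        # 'this file is worse than the king' needs no position in a chain
        self.members.add(file_id)

    def set_king(self, file_id):
        # 'this file is the best of the group'
        self.members.add(file_id)
        self.king = file_id

    def merge(self, other, king_from_self=True):
        # merging two groups pools the members and keeps exactly one king
        self.members |= other.members
        if not king_from_self:
            self.king = other.king
```

A chain of pairwise rankings needs O(n log n)-ish human decisions to order a group; this model needs only 'dupe or not' plus an occasional 'this one is now the best'.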
This 'King of a group' idea also maps nicely to how we use 'tag siblings'–having a complicated tree or chain of worst tag to best is not as useful as simply replacing all the lesser members of a group with the King as needed. When I get to overhauling tag siblings, I expect to make a similar change.
But in the meantime, for duplicates, I now have a plan. I expect to spend a few more weeks filling out the full details in code, and then I will switch us all over. The existing workflows should remain the same, just with fewer and easier comparisons. I will not do much specific work on file alternates, but they will be feasible to start on once the db overhaul is done. I will also continue to put time into the duplicate filter ui itself. Overall, I feel good about it all. I'd like the whole thing to be done within 8-12 weeks.
otherwise all misc this week
.ico files should now be supported! .cur files (which are basically the same) should work as well.
'collect by' settings are now, finally, saved in page sessions! If your default collect by settings include any ratings services, they will be forgotten on update, so you will have to reset them in options->sort/collect this one time.
I fixed the stupid issue where media viewer hover windows were popping up over manage tags and some other dialogs. This was due to a flaw in the changes from the new always-on-top duplicate hover panel–I apologise for the inconvenience. Some related OS X specific weirdness should be cleaned up as well.
The 'unclose_page' shortcut (default Ctrl+u) now uncloses pages in the correct order!
When a media fits the media viewer exactly (so 100% zoom fits the width or height exactly), the 'zoom switch' action (default 'z') now correctly restores back to 100%!
'open externally' should work better for some custom program paths. The flash projector (for .swf files) was opening without a ui, for instance. If you have had other programs seem to open in the background from open externally calls, please give them another go and let me know if they now work for you.
- the client now supports importing .ico files! (.cur should be supported too)
- finally, 'collect by' is saved for sessions! if your default collect by previously included ratings services, it will forget them this one time–please reset it under the options->sort/collect
- fixed the issue where the media viewer's hover windows were hovering over child dialogs (manage tags, ratings, or known urls)
- improved some os x hover window focus handling for the new always-on-top duplicate action window
- the entries on the 'sort by' list on gui pages are now subcategorised better. it should be a bit easier to find what you are looking for
- the 'sort by file: approximate bitrate' sort option now sorts still images as well by filesize / num_pixels
- to reduce confusion, sort by mime and system:mime are now renamed to 'filetype'
- fixed an issue where the 'unclose_page' shortcut was restoring pages in reverse order (unclosing least-recently-closed-first rather than most-recently-closed-first)
- improved rigour of video framerate estimation
- stopped the video metadata parser from opting to manually frame count videos with size >128MB or num_frames estimate >2,400
- fixed the forced manual frame count to deal with frame counts >9999
- the 'ffmpeg not found' error on file import will now put up a popup message once per boot, explaining the problem more broadly and the steps to address it
- fixed some underreporting issues with subprocess_report_mode
- fixed an issue with some yes/no dialogs returning 'no' on escape/window_close_button rather than 'cancel', which affected the cancelability of some db maintenance questions
- fixed an issue where media that fitted the media viewer canvas width or height exactly at 100% zoom would not respond to zoom switch events to restore non-100% zoom to 100%
- when a local server's CORS mode is turned on, Access-Control-Allow-Origin is now correctly added to GET/POST requests with an Origin request header
- improved reliability of some timestamp rendering code, which should help some users who had trouble opening cookies management page after malformed cookie import
- I believe I fixed an issue with 'open externally' on certain custom paths where the external program could spawn without a ui (flash projector did this). please let me know if your 'open externally' calls start making terminal windows everywhere
- fixed a runtime stability issue with the new duplicates page and slow-updating counts that come in after the page has been deleted
Next week is an 'ongoing' week, where I work on a medium-sized improvement to an existing system. I think I would like to put some time into my 'background file maintenance' plans, unifying the current prototype systems into one that runs nicely in idle time and adding some ui controls for it. There are several pending file-reparsing jobs I would like to queue up (e.g. re-checking mkvs vs webms with modern file parsing code, apng discovery, background thumbnail regen, and fixing bad old frame counts and durations), and if I want to integrate videos into the duplicates system, I'll need a better framework here to schedule that retroactive CPU work. Otherwise I have a couple of little cleanup jobs to be getting on with, and I'll start some new duplicates db code.