
Scaled instances and the deletion problem #80

Open
geek-at opened this issue Dec 29, 2018 · 7 comments
@geek-at (Member) commented Dec 29, 2018

Now that the codebase has been rewritten, we can start thinking about the main problem with scaling PictShare: deleting content.

Imagine two PictShare servers connected through a shared folder (ALT_FOLDER).

An image is requested frequently, so both servers have a local copy and there is also a copy in the shared folder.

If the user wants to delete the image, it gets deleted from the server that received the request and from the shared folder.

But the second server never gets any information about the deleted hash, so it keeps its local copy.

Possible solutions:

  • Keep a list of deleted hashes in all storage controllers
  • Use some kind of centralized database that manages all hashes and their statuses
  • Make all nodes somehow communicate with each other
@thomasjsn (Contributor)

I really love that this app doesn't require a database; a centralized database would introduce some complexity. A list of deleted hashes in all storage controllers plus a cron job is quite simple and would do the job. I'm guessing instant deletion isn't really required.

@cwilby commented Nov 19, 2019

Just some thoughts:

Each server maintains a list of peers.

The first server is created (0), then the second server is created (1) and pointed to 0. 0 and 1 both update their lists to [0,1].

For each server added after 1, the new server is pointed at any existing server (N). N iterates through every server in its list except itself (if N is 1, this subset is [0]) and sends an HTTP message telling each of them to add the new server to their list, so every server ends up with the same, complete list.

With this in place, when a server receives a delete request, it performs the delete, then sends a delete signal via HTTP to each server on its list (which should be up to date given the above works).


TL;DR - I agree with making nodes communicate.
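To make that concrete, here's a rough sketch in PHP of what the delete broadcast could look like. It assumes each node keeps its peer list in a plain peers.txt (one base URL per line) and exposes some delete endpoint; both the file name and the /api/delete.php path are made up for illustration, nothing like this exists in PictShare yet:

```php
<?php
// Hypothetical sketch: broadcast a delete to all known peers.
// peers.txt holds one base URL per line, e.g. https://node1.example.com
// The file name and the /api/delete.php endpoint are assumptions, not existing PictShare code.
function broadcast_delete(string $hash): void
{
    $peers = file('peers.txt', FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    if ($peers === false) {
        return; // no peer list, nothing to notify
    }
    foreach ($peers as $peer) {
        $url = rtrim($peer, '/') . '/api/delete.php?hash=' . urlencode($hash);
        $ch  = curl_init($url);
        curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
        curl_setopt($ch, CURLOPT_TIMEOUT, 5);
        curl_exec($ch);   // best effort: a peer that is down simply misses the signal
        curl_close($ch);
    }
}

// After deleting the local copy and the ALT_FOLDER copy:
// broadcast_delete($hash);
```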

@geek-at (Member, Author) commented Jan 5, 2020

The problem with all nodes talking to each other is that it would complicate the whole project enormously.

I think the easiest way to implement it would be to have a list of deleted hashes that won't get re-used by chance, and this list should be copied and checked by all storage providers.

@cwilby commented Jan 6, 2020

Sounds good. Where would the deleted hashes be stored? If each node has its own copy, it would be similarly complex.

@geek-at (Member, Author) commented Jan 7, 2020

The easiest implementation would be a simple file where deleted hashes are stored.

This file would then be compared with the list on every storage controller: every PictShare instance should periodically check it for hashes to delete, and check the storage controllers for new hashes to add to its local list.

It's just a simple blacklist system. I think that could work.
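Roughly, the periodic check on each node could look something like the sketch below. It assumes the blacklist is a plain deleted.txt with one hash per line living in the shared ALT_FOLDER, that each node keeps a local copy of that list, and that images are stored under data/<hash>/; all of those names and paths are placeholders, not existing PictShare code:

```php
<?php
// Hypothetical blacklist sync, meant to be run from a cron job on every node.
// All paths and file names below are placeholders for this sketch.
$shared_list = '/mnt/shared/pictshare/deleted.txt'; // blacklist in the shared ALT_FOLDER
$local_list  = __DIR__ . '/data/deleted.txt';       // this node's local copy of the blacklist

$read = function (string $path): array {
    return is_file($path)
        ? file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES)
        : [];
};

$shared_hashes = $read($shared_list);
$local_hashes  = $read($local_list);

// Remove every blacklisted hash that still exists on this node.
foreach ($shared_hashes as $hash) {
    $dir = __DIR__ . '/data/' . basename($hash);    // basename() avoids path traversal
    if (is_dir($dir)) {
        array_map('unlink', glob($dir . '/*'));
        rmdir($dir);
    }
}

// Merge both lists so hashes deleted on this node also reach the shared blacklist.
$merged = array_unique(array_merge($shared_hashes, $local_hashes));
file_put_contents($shared_list, implode("\n", $merged) . "\n");
file_put_contents($local_list,  implode("\n", $merged) . "\n");
```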

@cwilby commented Jan 7, 2020

Yep, that sounds like it could work. Each node can be configured to communicate with a service to add/read deleted hashes. Would the service be the root PictShare instance or something else?

@geek-at (Member, Author) commented Jan 7, 2020

I'm thinking a cron job, so admins can set their own intervals for comparing against the blacklist, and deletions can take as much time as they need.
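For example, the crontab entry on each node could be as simple as the line below; the script path and the 15-minute interval are just placeholders that every admin would pick for themselves:

```
# run the blacklist comparison every 15 minutes (script path is an example)
*/15 * * * * php /var/www/pictshare/tools/sync_deleted.php
```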
