[–] logos_ethos ago (edited ago)
If it was for the data lengths, I added that with CHECK constraints. Constraints are a lot easier to change after the fact than data types.
If you are porting from a case insensitive database to a case sensitive one, then citext is the way to go. That is the most transparent way of porting the database. Otherwise, you will have a lot of unexpected changes to make. If you were developing on a case sensitive database from scratch, then text or varchar with functions might be better, but not necessarily. You would be adding all of the lower functions, case insensitive operators, and index expressions as you went along. If you use an ORM, citext might be required if the ORM does not know how to properly use PostgreSQL's case insensitive operators. In a little over a year, PostgreSQL might be getting case insensitive collations via ICU Collation, but it will basically be doing the same thing as citext, which is transparently making everything lower case for comparison operations and populating indexes.
Let me know if there is anything that I can do to help. If you want case sensitivity by default, I can help add the required case insensitive operations. This will add time to the port though. I do not know .Net very well, but I can be very useful on the database side. If the ORM has issues with case insensitive operations, then you might have to use citext anyway. I plan on helping with performance tuning. To start, try http://pgtune.leopard.in.ua/ if you have not already. When entering RAM, minus anything else running on the server, such as the web server and OS. Number of Connections is an upper limit, so be generous with it and add a few for the reserved connections for DBAs.
[–] PuttItOut 0 points 2 points 2 points (+2|-0) ago
I think I was actually fairly upset that PG did not honor collations on the text by default. At this point I'm trying to convert all the areas using a common expression so when/if PG does support native case insensitive text types we can replace these extra functions in code easily.
It will be a while until Voat is ready to use PG for a production database so this isn't the most urgent issue we have. We primarily wanted to support PG so that anyone can run Voat on any software deployment. That was our primary goal for supporting another backend option, and so far we have met that goal.
I'd love to run some tests on the citext but I don't have time right now to test this out. All of the scripts are located here if you want to help us with this: https://github.com/voat/voat/tree/Core/misc/SqlScripts/PostgreSql
And you bring up another issue, I don't have admin knowledge with PG.
[–] logos_ethos 0 points 2 points 2 points (+2|-0) ago (edited ago)
Just use citext. That is what other people do, and there is nothing wrong with it. When PostgreSQL adds case insensitive collations, it will be more transparent to migrate from citext to that than with any other solution. citext will be less painful for you now and in the future when you make the switch.
I have voat-citext.patch and voat-tz.patch. Do you want me to create a github account and create a pull request for both? I am fine with someone else taking those patches (@FuzzyWords ?).
If you want to trust me, I can be that admin.
[–] [deleted] ago
[–] PuttItOut 0 points 2 points 2 points (+2|-0) ago
I'm willing to accept I was wrong as my emotions where high when I realized collation actually doesn't matter in PG.
As far as this pull request goes, we do not need to use citext on the voat_users.sql as we use the Core Identity and they handle this via using NORMALIZED__ columns in areas that matter.
So we shouldn't change this schema if I'm thinking correctly.
@logos_ethos
[–] logos_ethos 0 points 1 point 1 point (+1|-0) ago (edited ago)
Collation does matter. It defaults to UTF-8, which is appropriate for web usage. It is also used for sorting, as different locales have different sort orders for their symbols. This affects indexes and comparison operations.
@FuzzyWords
Personally, I would still use citext just to retain the exact behavior of the other database software that you use. Maybe it works now, but you might add something later that would introduce different behavior on the database software that you support.
So I just took a look at Entity Framework. Like I said before, I do not know .Net very well. I am not seeing positive things about Entity Framework and citext. All of the complaints are very old, so maybe the situation has improved.
If the unit tests do not go well, then I can look into the non-citext methods, which will solve bugs as they come in. It will probably leave a few bugs that we cannot find right away. SQL at the application level will have to change, and I do not know Entity Framework well enough to fix that by myself. I will be glad to work with anyone who wants to work with me on it.