Category Archives: Code

Software Development

‘bash\r’: No such file or directory. Or, editing unix files on a Windows machine

'bash\r': No such file or directory

What does that mean, you ask yourself? It usually means you are editing *nix script files on a windows system, and then running the files on a *nix machine. For instance in docker, or a VM. Your solution is to use a text editor that allows you to save with unix line endings. `Notepad++` calls it, very reasonably, `’EOL conversion’`. Sublime text calls it `View->Line Endings`.

A further complication if you are using git is that it keeps getting reset back to windows. You could fix this with `git config core.autocrlf false` but that may have other consequences which you don’t want to bother with.

Postgres : Using Integrated Security or ‘passwordless’ login on Windows on localhost or AD Domains

…is slightly less straightforward than you might hope, but helpfully more flexible. For MS SQL Server, integrated security implies that windows user are magically also SQL users and that no password or username is needed to login. But also, that you can no longer choose which user you login as. Postgres is more configurable and more complex. You can specify which users use SSPI and which postgres user(s) each windows user can login as. You can specify, for instance, that you are allowed to use SSPI to login as the postgres superuser.

Here is how you can login with integrated security, as the user postgres, whilst still being able to login as a different user with a password.

  1. Locate and open for editing two files: your pg_hba.conf and pg_ident.conf files. Find them both in the same directory in e.g.
    C:\Program Files\PostgreSQL\data\pg96 or
    C:\Program Files\PostgreSQL\10\data\
  2. In pg_ident.conf add a line to map your windows login, in [email protected] format, to the postgres user named postgres. You can also add other users. Here’s what my lines look like:
    # MAPNAME       SYSTEM-USERNAME         PG-USERNAME
    MapForSSPI     [email protected]    chris
    MapForSSPI     [email protected]    postgres

    (In normal unix style, the columns are separated by any amount of space or tab).

  3. In pg_hba.conf, add lines that allow user postgres to login with integrated security, whilst still allowing all other users to login with passwords. Again, you can add lines for other users too. Don’t forget to put lines targetting specific users above the catchall lines otherwise they will never be reached.
    # TYPE  DATABASE        USER            ADDRESS                 METHOD
    
    #== lines for specific users for SSPI (or anything else) BEFORE the catchall lines ==
    # IPv4 local connections for SSPI:
    host    all             postgres        127.0.0.1/32            sspi 	map=MapForSSPI
    host    all             chris           127.0.0.1/32            sspi 	map=MapForSSPI
    # IPv6 local connections for SSPI:
    host    all             postgres        ::1/128                 sspi 	map=MapForSSPI
    host    all             chris           ::1/128                 sspi 	map=MapForSSPI
    #===================================================================================
    
    # IPv4 local connections:
    host    all             all             127.0.0.1/32            scram-sha-256
    # IPv6 local connections:
    host    all             all             ::1/128                 scram-sha-256
    
  4. Restart the Postgres service, for instance with a powershell command
    Restart-Service 'PostgreSQL 9.6 Server'
  5. Trying logging in as user postgres:
    • psql -h localhost -U postgres
  6. Trying logging in as some other user:
    • psql -h localhost -U someotherusercreatedwithcreaterole
      and you should be prompted for a password (unless you already mastered the pgpass.conf file)

Logging in without specifying a user name

You might expect that SSPI implies not having to specify a username. You would be wrong. Postgres still requires you specify a username when using SSPI, and, as above, allows you to choose which username.

You can however login without a username—with or without SSPI—if there is a postgres user (i.e., a role with LOGIN privilege) with your Windows username (just the name, without the @machinename).

By combining this with the SSPI map above you can then login without typing username or password.

Integrated Security in .Net connection strings

Having done the above I can now use either of

"Host=localhost;Database=MyDb;Integrated Security=True;Username=postgres"
"Host=localhost;Database=MyDb;Integrated Security=True;Username=chris"

as a connection string for the npgsql Ado.Net driver

Reference

https://www.postgresql.org/docs/9.4/static/auth-pg-hba-conf.html and subsequent pages on Authentication methods and the pg_ident.conf file.

Caveats

Why does the title of this post say ‘localhost or AD domains’? Because SSPI only works on Windows; and only on either localhost, or a Windows Domain environment as per https://wiki.postgresql.org/wiki/Configuring_for_single_sign-on_using_SSPI_on_Windows

Which brings us to the alternative that does work remotely without Domain servers: putting passwords in the pgpass.conf file.

Postgres ‘Passwordless’ Login

Storing passwords in plaintext on a windows machine is largely a no-no in most peoples eyes. Unixland is more accepting of it, perhaps because they habitually expect file permissions  to deny access to unauthorised users. And don’t expect to have virusses scanning their machines.

psql.exe on Windows will look for a %appdata%\PostGres\pgpass.conf file (or $env:AppData\PostGres\pgpass.conf for PowerShellers) and will parse lines in this format:

hostname:port:database:username:password

See https://www.postgresql.org/docs/9.1/static/libpq-pgpass.html for some wildcard options such as

localhost:*:*:chris:mydevpassword

Your plaintext password is then protected by Windows file permissions. Which should be fine for passwords to non-production servers.

Postgres quick start for MS SQL / T-SQL developers

After 17 years on T-SQL I’m at last on my first project with Postgres. Here are the first things I needed for the transition:

  • Do this first: https://wiki.postgresql.org/wiki/First_steps as it answers most of your first questions about connections, databases, using roles for logins, users and groups, and so on.
  • 'String concatenation is done with ' || 'a double pipe '.
  • No NVarchar is needed, just varchar. The default installation uses UTF-8. In fact the usual idiom is to use the text type, because omitting the length-specifier performs marginally better on postgres.
  • Date & Time: use timestamp for datetime/datetime2
  • Read https://www.postgresql.org/docs/9.1/static/datatype-datetime.html for more on dates, times and intervals.
  • Use semicolon terminators nearly everywhere. If ... Then ... ; End If ; T-SQL doesn’t need them but Postgres demands them.
  • plpgsql script is strictly separated from “pure” standard SQL in a way that T-SQL just doesn’t bother with. If you are writing scripts, plpgsql must either be inside a function or inside an ‘anonymous function block’. It is always delimited by $$ or $somestring$; and Declare variables in one block before Begin :
    Select 'This line is plain SQL' ;
    Do $$
    Declare
      anumber integer ;
      astring varchar(100);
    Begin
    If 1=1 Then
     Raise Notice 'This block is plpgsql' ;
    End If;
    End $$ ;
    Select 'This line is SQL again' ;
    

    In fact, Postgres functions are defined in a string. The $$…$$ delimiters is actually a postgres syntax for a literal string constant anywhere in code, in which you don’t have to escape any special characters at all. Functions can be defined in other languages than plpgsql. See e.g. https://pgxn.org/dist/plv8/doc/plv8.html.

  • Variables: just use a plain identifier myvariablename with no @ symbol or other decoration. BUT as a consequence you must avoid variable names in a query that are the same as a column name in the same query. BUT BUT in Ado.Net Commands with Parameters, still use the @parametername syntax as you would for SQL Server
  • Use Function not procedure:
    Create or Replace Function procedureName( forId int, newName varchar(20) )
    Returns int -- use returns void for no return
    As $$
    Declare 
      localvariable int;
      othervariable varchar(10);
    Begin
     Insert into mytable (id,name) values (forId, newName) On Conflict (id) Do Update Set Name=NewName ;
    End $$;
  • Ooh, did you notice the ‘Create or Replace’ syntax, and how Postgres has a really good Upsert syntax – Insert On Conflict Update ?
  • Postgres does function overloads, so to drop a function you must give the signature: Drop function functionname(int)
  • Whereas the T-SQLer does everything in T-SQL, other database systems use non-sql commands for common tasks. For working at a command-line, learn about psql meta-commands.. Start with \c to change database, \l to list databases, \d to list relations, and \? to list other meta commands.
  • Postgres have (I think) made more effort than MS to stay close to the SQL standards and as with T-SQL, the docs go into detail on deviations. But this means that much of your code for databases, schemas, users, tables, views, etc can be translated fairly quickly.
  • Replace Identity with Serial or from Postgres 10 you can use the standard-but-wordy GENERATED BY DEFAULT AS IDENTITY. For more complex options, Postgres uses SQL sequences.
  • Postgres replaces logins and database users and groups with a single concept: role. Permit a role to login with the With Login clause when you Create or Alter it: Create Role MyNameHere With Login. Grant it access to a database with Grant Connect: Grant Connect on Database dbname TO roleName. Make it a group with – well, it already is a group. Add other roles to it with Grant Role1 to Role2. I think this approach turns out to be much easier both to evolve and to understand and manage.
  • The Nuget package for Ado.Net, including for .Net core, is Npgsql
  • Scan these

Converting T-SQL to Postgres

Here’s my initial search/replace list for converting existing code:

Search Replace
@variableName _VariableName
but stay with @variable for .Net clients using Npgsql
NVarChar Text or VarChar
datetime
datetime2
timestamp
Identity Serial
The SQL-standardGenerated by default as Identity works from Postgres 10.
Raise Raise Exception
Print Raise Notice
Select Top 10 ... Select ... Fetch First 1 Rows Only ;
Insert Into table Select ... Insert into Table (COLNAMES) Select ...
If Then Begin … End
Else Begin … End
If Then … ;
Else … ; End If ;
Create Table #Name Create Temporary Table _Name
IsNull Coalesce
UniqueIdentifier uuid
NewId() First websearch for Create Extension "uuid-ossp" to install the guid extension to a database. Then you can use uuid_generate_v4() and other uuid functions
Alter Function|Procedure Create or Replace Function

Authorization

Usually you will connect to postgres specifying a username. The psql.exe commandline tool will default it to your windows username. You can avoid passwords in scripts that use psql by putting them in the pgpass.conf file). If you want to set up integrated security, the limitation is that for computers not in a Domain it only works on localhost. The instructions at Postgres using Integrated Security on Windows take about 5 minutes for the localhost case, and include a link to the extra steps for the Domain case.

Conway’s Law & Distributed Working. Some Comments & Experience

The eye-opener in my personal experience of Conway’s law was this:

A company with an IT department on the 1st floor, and a marketing department on the 2nd floor, where the web servers were managed by the marketing department (really), and the back end by the IT department.

I was a developer in the marketing department. I could discuss and change web tier code in minutes. To get a change made to the back end would take me days of negotiation, explanation and release co-ordination.

Guess where I put most of my code?

Inevitably the architecture of the system became Webtier vs Backend. And inevitably, I put code on the webserver which, had we been organised differently, I would have put in a different place.

This is Conway’s law: That the communication structure – the low cost of working within my department vs the much higher cost of working across a department boundary – constrained my arrangement of code, and hence the structure of the system. The team “just downstairs” was just too far.  What was that gap made of? Even that small physical gap raised the cost of communication; but also the gaps & differences in priorities, release schedules, code ownership, and—perhaps most of all—personal acquaintance; I just didn’t know the people, or know who to ask.

Conway’s Law vs Distributed Working

Mark Seemann has recently argued that successful, globally distributed, OSS projects demonstrate that co-location isn’t all it’s claimed to be. Which set me thinking about communication in OSS projects.

In my example above, I had no ownership (for instance, no commit rights) to back end code and I didn’t know, and hence didn’t communicate with, the people who did. The tools of OSS—a shared visible repository, the ability to ‘see’ who is working on what, public visibility of discussion threads, being able to get in touch, to to raise pull requests—all serve to reduce the cost of communication.

In other words, the technology helps to re-create, at a distance, the benefits enjoyed by co-located workers.

When thinking of communication & co-location, I naturally think of talking. But @ploeh‘s comments have prodded me into thinking that code ownership is just as big a deal as talking. It’s just something that we take for granted in a co-located team. I mean, if your co-located team didn’t have access to each other’s code, what would be the point of co-locating?

Another big deal with co-location is “tacit” knowledge, facilitated by, as Alistair Cockburn put it, osmotic communication. When two of my colleagues discuss something, I can overhear it and be aware of what’s going on without having to be explicitly invited. What’s more, I can quickly filter out what isn’t relevant to me, or I can spontaneously join conversations & decisions that do concern me. Without even trying, everyone is involved when they need to be in a way that someone working in a separate room–even one that’s right next door–can’t achieve.

But a distributed project can achieve this too. By forcing most communication through shared public channels—mailing lists, chatrooms, pull request conversations—a distributed team can achieve better osmotic communication than a team which has two adjacent rooms in a building.

The cost, I guess, is that typing & reading is more expensive (in time) than talking & listening. Then again, the time-cost of talking can be quite high too (though not nearly as a high as the cost of failing to communicate).

I still suspect that twenty people in a room can work faster than twenty people across the globe. But the communication pathways of a distributed team can be less constrained than those same people in one building but separated even by a flimsy partition wall.

References