Problem with mpirun from MPICH1

We’re using PGI’s MPICH1 on a new installation. Whenever I run mpirun as a non-root user (I haven’t tried it as root), I get the following:

[dbryan@stj-dmz-hpc1 SMALLBENCH]$ mpirun -machinefile compute_mpi.conf -np 48 ./wrf.exe &
[2] 10759
[dbryan@stj-dmz-hpc1 SMALLBENCH]$ dbryan@compute-0-0’s password:
mypassword
-bash: syntax error near unexpected token `?’

[2]+ Stopped mpirun -machinefile compute_mpi.conf -np 48 ./wrf.exe

First of all, this password request is new to me; I have not encountered it on other installations. Second, the password I give is the one I use to log in to the computer itself, though it appears this request is looking for a different password. (And the ‘?’ is the last character in my password.) Third, even if I don’t have the right password, I don’t understand why I’d get a syntax error instead of a password-failure error. (Perhaps the prompt doesn’t accept characters like ‘?’.) Fourth, running wrf.exe directly, without mpirun, works fine.

I’ve reviewed mpirun --help and searched on-line about this but found nothing. Your help would be appreciated. Thanks!

Hi DSB73,

You must have set up MPICH to use ssh instead of rsh. ssh requires a password unless you create authentication keys. Doing a web search for “SSH login without password” will give you instructions on how to set this up.

Note that your ssh passphrase may be different from your login password. I’m not sure why you’re getting the syntax error, though.

  • Mat

Mat,

We’re having the same problem as the original poster. Your previous reply was very helpful as we now understand the problem. However, we’re not sure how to fix it. We’ve created authentication keys for various users into various machines, but we haven’t had any success. On what machine is the ssh originating and what machine is the ssh going to? That way we can set up the appropriate keys.

Thanks,
Andy

Hi Andy,

I can’t claim to be an expert on this, only that I needed to set this up for myself on our internal network. In my case, I just followed instructions I found on the web. I have my keys in my home directory under a “.ssh” directory and all the hosts I use can mount this directory.

I’ll ask one of our IT guys if they have any better advice for you.

  • Mat

Often, the easiest thing to do is to use a script that comes on most modern machines, ssh-copy-id.

Let’s say you want passwordless SSH from MachineA into MachineB. On MachineA, generate a key pair by running, for example:

MachineA: ~/.ssh $ ssh-keygen -t rsa

inside your .ssh directory. You should then have two files there, id_rsa (the private key) and id_rsa.pub (the public key). Now run:

MachineA: ~ $ ssh-copy-id -i ~/.ssh/id_rsa.pub username@MachineB

Obviously, it’ll ask for the password this time, but after that you should be able to ssh into MachineB without a password. (If you need the other direction, just swap A and B.) ssh-copy-id is nice in that it sets up the .ssh directory on MachineB correctly even if you haven’t done anything there yet: it runs the equivalent of “chmod 700 .ssh” (the permissions .ssh must have on every machine) and creates an authorized_keys file if you don’t have one.
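To confirm that the key really is being used, it can help to test each node non-interactively before going back to mpirun. The following is only a sketch: check_hosts is a made-up helper name, and the host names you pass to it are whatever appears in your own machinefile. It relies on ssh’s BatchMode option, which makes ssh fail rather than stop and ask for a password.

```shell
# Report which hosts accept key-based login without prompting.
# check_hosts is a hypothetical helper, not part of MPICH or OpenSSH.
# BatchMode=yes makes ssh fail instead of asking for a password.
check_hosts() {
    failed=0
    for host in "$@"; do
        # Run a trivial remote command; stdin is detached so ssh cannot
        # swallow input meant for the calling shell.
        if ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true \
                </dev/null 2>/dev/null; then
            echo "ok   $host"
        else
            echo "FAIL $host"
            failed=1
        fi
    done
    return "$failed"
}
```

For example, “check_hosts compute-0-0 compute-0-1” run on the machine where you start mpirun prints one line per node; any FAIL line means that node would still prompt for a password.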

Now, on some clusters, every node might look at the same .ssh directory (since home is, say, NFS mounted). In that case, you’ll often need to add MachineA’s key to MachineA’s .ssh/authorized_keys file:

MachineA : ~ $ cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys

Note: if you use ssh-agent, this will probably error out, but ssh-copy-id does work with ssh-agent as well. And if you generated DSA keys, then use the file id_dsa.pub instead.
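For the shared-home case, a slightly more careful version of the cat command avoids adding the same key twice and sets the permissions at the same time. This is a sketch under assumptions: authorize_self and the SSH_DIR variable are names invented here, and it assumes an RSA key in id_rsa.pub as in the example above.

```shell
# Append the public key to authorized_keys only if it is not already
# there, then tighten permissions.
# authorize_self and SSH_DIR are hypothetical names for this sketch;
# SSH_DIR defaults to ~/.ssh.
authorize_self() {
    dir=${SSH_DIR:-$HOME/.ssh}
    pub=$dir/id_rsa.pub
    auth=$dir/authorized_keys

    [ -f "$pub" ] || { echo "no public key at $pub" >&2; return 1; }

    touch "$auth"
    # -x: match the whole line, -F: treat the key as a fixed string.
    grep -qxF -- "$(cat "$pub")" "$auth" || cat "$pub" >> "$auth"

    chmod 700 "$dir"
    chmod 600 "$auth"
}
```

Calling authorize_self with no arguments acts on ~/.ssh; running it a second time leaves authorized_keys unchanged.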