Test suite fails when running PGAP #10

Open
SJohnsonMayo opened this issue Feb 17, 2021 · 5 comments

@SJohnsonMayo

Hi all,

I'm trying to run RAPT on an HPC system running CentOS 8 with Singularity version 3.5.3 (Docker is not an option for me). The first 5 tests pass, but the 6th test fails.

This is the output I get in concise.log for that test:

[2021-02-17 15:47:54] Starting test #6: Contigs with matching SRA run and taxonomy information in submol file
[2021-02-17 15:47:54] No submol yaml provided. Try to make one from acxn ERR3938362
[2021-02-17 15:47:59] Acquired genus_species=Mycoplasma feriruminatoris
[2021-02-17 15:59:01] Pipeline pgap failed on FASTA file /dkm_output_dir/ERR3938362_sk_out.fa and submol file /dkm_output_dir/.rapt_scratch/ERR3938362_Mycoplasma_feriruminatoris.yaml
[2021-02-17 15:59:01] **FAILED**


[2021-02-17 15:59:01] Test cycle complete, total tests run: 6, total succeeded: 5.
[2021-02-17 15:59:01] Sending PINGER url https://www.ncbi.nlm.nih.gov/stat?ncbi_app=raptdocker&version=2021-01-11.build5132&uuid=fddd345e-7160-11eb-83d6-1866daee9238&evt=test_end
[2021-02-17 15:59:06] status=21
[2021-02-17 15:59:06] Sending PINGER url https://www.ncbi.nlm.nih.gov/stat?ncbi_app=raptdocker&version=2021-01-11.build5132&uuid=fddd345e-7160-11eb-83d6-1866daee9238&evt=rapt_exit

I'm really not sure why this is failing. Is there any other information I can provide?

@techshine2018
Contributor

Dear SJohnsonMayo,

If you could please attach (drag-and-drop) the verbose.log here, it would be extremely helpful. Meanwhile, is it possible for you to run the same command (--test) again? I suspect this is just a glitch, but I need to make sure, especially for users who are running non-Docker containers.

Thanks so much for trying our product!

Shennan Lu

@SJohnsonMayo
Author

Thanks for the fast response! I ran the command again and got the same error. I've attached the verbose.log file; hopefully there is something useful in there! I also noticed that the console output for this error doesn't seem to be present in the log files, so here it is in case it's useful:

../rapt/run_rapt.py --test
RAPT is now running, it may take a long time to finish. To see the progress, track the verbose log file /research/labs/microbiome/chia/m141127/rapt_test/raptout_a3176de72d/verbose.log.

[2021-02-19 11:17:07] Test cycle complete, total tests run: 6, total succeeded: 5.
[2021-02-19 11:17:07] Sending PINGER url https://www.ncbi.nlm.nih.gov/stat?ncbi_app=raptdocker&version=2021-01-11.build5132&uuid=ca3176de-72d1-11eb-b9d8-1866daee9238&evt=test_end
[2021-02-19 11:17:11] Usage metrics sent to NCBI
[2021-02-19 11:17:11] status=21
[2021-02-19 11:17:11] Sending PINGER url https://www.ncbi.nlm.nih.gov/stat?ncbi_app=raptdocker&version=2021-01-11.build5132&uuid=ca3176de-72d1-11eb-b9d8-1866daee9238&evt=rapt_exit
[2021-02-19 11:17:16] Usage metrics sent to NCBI
INFO:    Cleaning up image...
ERROR:   failed to delete container image /research/labs/microbiome/chia/m141127/singularity_cache/rootfs-173922156: unlinkat /research/labs/microbiome/chia/m141127/singularity_cache/rootfs-173922156/root/infernal.tgz: permission denied

verbose.log

@techshine2018
Contributor

Dear SJohnsonMayo,

After reviewing the verbose log (thanks so much for providing it), it seems to me the relevant error message is as follows:

Error: error processing job: (CCoreException::eNullPtr) Attempt to access NULL pointer.
Error memory mapping:/dkm_ref_data/input-2021-01-11.build5132/uniColl_path/blast_dir/NamingDatabase.06.phr openedFilesCount=883 threadID=36
Unable to open a seqid chunk file for reading at /dkm_ref_data/input-2021-01-11.build5132/uniColl_path/cache/seq_id_chunk (errno = 24: Too many open files)
Error: (CSeqDBException::eFileErr) Cannot memory map /dkm_ref_data/input-2021-01-11.build5132/uniColl_path/blast_dir/NamingDatabase.06.phr. Number of files opened: 883;     Stack trace:;      
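
For anyone hitting a similar failure, the telling line in the excerpt above is the one reporting errno 24. A minimal sketch for pulling such lines out of a long verbose.log follows; the function name `find_fd_exhaustion` and the pattern are illustrative assumptions, not part of RAPT.

```python
import re

# Matches the two markers seen in the excerpt above: the numeric errno
# and the standard message text for EMFILE.
ERRNO_24 = re.compile(r"errno\s*=\s*24|Too many open files")


def find_fd_exhaustion(lines):
    """Return (line_number, text) pairs for lines that report errno 24."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if ERRNO_24.search(line)
    ]


# Usage (hypothetical path; point it at your own log file):
#     with open("verbose.log") as handle:
#         for number, text in find_fd_exhaustion(handle):
#             print(number, text)
```

Any hit suggests the process ran out of file descriptors rather than failing inside the pipeline logic itself.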

It looks like the maximum number of open files available to Singularity is limited. Unfortunately, this is not something we can fix; it is a setting on your system. I suggest you run the command

ulimit -n

If the number is lower than 1024, chances are that's the problem. I suggest you ask your system administrators to set it to a higher number, such as 8192.
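
The same check can be made from Python with the standard `resource` module, which also reports the hard limit (the ceiling a non-root user can raise the soft limit to). This is a rough sketch for Linux-like systems; the helper names are made up for illustration, and the 1024 threshold is simply the figure mentioned above.

```python
import resource


def open_file_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def describe_limit(value):
    """Render a limit as text, treating RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


def soft_limit_below(threshold=1024):
    """True when the soft limit is finite and lower than `threshold`."""
    soft, _hard = open_file_limits()
    return soft != resource.RLIM_INFINITY and soft < threshold


soft, hard = open_file_limits()
print("soft:", describe_limit(soft), "hard:", describe_limit(hard))
```

The soft value is what `ulimit -n` prints; if the hard value is also low, only an administrator can raise it.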

Best,

Shennan Lu

@SJohnsonMayo
Author

Thank you! I was able to get the test suite to complete by setting the ulimit to the maximum that I can on my own (ulimit -Sn 2000). Unfortunately, I get the same error as before when I try to process my own data. We have a much higher ulimit when using the scheduler, but then I quickly run into network issues. Is there a way to run RAPT "offline"? Or any arguments I can change to get past the ulimit issue?
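
For reference, here is a rough Python equivalent of the `ulimit -Sn` step: it raises the soft limit as far as the hard limit allows and then launches a command, which inherits the limit as child processes normally do. Whether a given container runtime passes the limit through is an assumption to verify on your own system, and `raise_soft_nofile`, `run_with_raised_limit` and `DEMO_COMMAND` are hypothetical names.

```python
import resource
import subprocess
import sys


def raise_soft_nofile(target=8192):
    """Raise the soft open-file limit toward `target`, capped by the hard limit."""
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if soft == resource.RLIM_INFINITY or soft >= target:
        return soft  # already at least as high as requested
    new_soft = target if hard == resource.RLIM_INFINITY else min(target, hard)
    try:
        resource.setrlimit(resource.RLIMIT_NOFILE, (new_soft, hard))
    except (ValueError, OSError):
        pass  # the OS refused the change; keep the existing limit
    return resource.getrlimit(resource.RLIMIT_NOFILE)[0]


def run_with_raised_limit(command, target=8192):
    """Raise the soft limit, run `command`, and return its exit code."""
    raise_soft_nofile(target)
    return subprocess.run(command, check=False).returncode


# Demo child that prints the soft limit it inherited from this process.
DEMO_COMMAND = [
    sys.executable,
    "-c",
    "import resource; print(resource.getrlimit(resource.RLIMIT_NOFILE)[0])",
]
```

Calling `run_with_raised_limit(["../rapt/run_rapt.py", "--test"])` would mirror the test invocation shown earlier in this thread. It cannot exceed the hard limit, so it does not help where that limit itself is too low.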

Thanks again,

Stephen

@techshine2018
Contributor

Hi SJohnsonMayo,

Sorry for the long delay. It is indeed possible to run RAPT offline if you can provide the input sequence reads in FASTQ or FASTA format and the associated genus name. Do you have an email address so that I can send you detailed instructions?

Best,

Shennan
