@hortongn
Created July 11, 2025 13:20
Export Scholar work information including count of files and file sizes
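# Adds the size (in bytes) of a single FileSet to the running per-work total.
def get_file_size(file)
  @total_size += file.file_size.first.to_i
end

# Loads a work member by ID and counts its size only if it is a FileSet.
def get_object(object_id)
  object = ActiveFedora::Base.find(object_id)
  get_file_size(object) if object.is_a?(FileSet)
end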
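# Write the header row; the data rows below carry six columns, including the collection title.
File.write('/tmp/export.csv', "\"Work Title\",\"PURL\",\"DOI\",\"File Count\",\"Size (GB)\",\"Collection\"\n")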
# Total size per work type, in GB.
total_size_by_type = {}
work_types = [Dataset]
work_types.each do |type|
  type_total_bytes = 0
  type.all.each do |work|
    # Reset the per-work counters that get_object/get_file_size update.
    @total_size = 0
    @num_files = 0
    # @num_files counts every ordered member, including any that are not FileSets.
    work.ordered_member_ids.each do |object_id|
      get_object(object_id)
      @num_files += 1
    end
    # Accumulate raw bytes per type before converting this work's size to GB.
    type_total_bytes += @total_size
    size_gb = @total_size.to_f / 1_000_000_000
    collection = ''
    unless work.member_of_collections.first.nil?
      collection = work.member_of_collections.first.title.first
    end
    File.write('/tmp/export.csv', "\"#{work.title.first}\",\"https://scholar.uc.edu/show/#{work.id}\",\"#{work.doi}\",\"#{@num_files}\",\"#{size_gb.round(4)}\",\"#{collection}\"\n", mode: 'a')
  end
  total_size_by_type[type.to_s] = type_total_bytes.to_f / 1_000_000_000
end
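
The script above builds each CSV row by hand, so a title or collection name containing a double quote would produce a malformed row. A minimal sketch of one alternative, assuming Ruby's bundled `csv` library is available in the console: `bytes_to_gb` and `export_row` are hypothetical helper names that are not part of the original script, and the plain arguments stand in for the values read from each work.

```ruby
require 'csv'

BYTES_PER_GB = 1_000_000_000.0

# Hypothetical helper: converts a byte count to decimal gigabytes,
# rounded to four places as in the script above.
def bytes_to_gb(bytes)
  (bytes.to_f / BYTES_PER_GB).round(4)
end

# Hypothetical helper: builds one fully quoted CSV line.
# CSV.generate_line doubles any embedded double quotes.
def export_row(title:, id:, doi:, file_count:, size_bytes:, collection: '')
  CSV.generate_line(
    [
      title.to_s,
      "https://scholar.uc.edu/show/#{id}",
      doi.to_s,
      file_count,
      bytes_to_gb(size_bytes),
      collection.to_s
    ],
    force_quotes: true
  )
end
```

Inside the work loop, the returned line could be appended with `File.write('/tmp/export.csv', row, mode: 'a')`, keeping the same column order as the header.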